ABSTRACT


Hansen,  Steven  R.  M.S.,  Department  of  Computer  Science, 
Wright  State  University,  1986.  An  Empirical  Study  in  the 
Simulation  of  Heuristic  Error  Behavior. 

Many artificial intelligence programs search for solutions
among many alternatives, and depend upon heuristics to guide
the search and to ensure quality results. This thesis
documents empirical research performed in searching problems
and heuristic behavior. A general model is presented that
defines a broad class of searching domains, along with a set
of software tools designed to support research in them. One
puzzle configuration is devised from the model and studied
in depth, examining several subtle variations of the A*
searching algorithm, and results are compared with related
work from other similar domains.

Heuristics are then examined as statistical entities in
an attempt to substantiate theoretical work on the
equivalence of heuristics, and to verify whether the
statistical descriptions alone are sufficient to simulate
the performance of the actual heuristic. The technique of
simulation by statistical profile uncovers some subtle
performance trends, and promises to be a useful research
tool for focusing on particular aspects of heuristic behavior.


TABLE OF CONTENTS (CONTINUED)


C. Contrived Heuristics
   1. Normal Distribution
   2. Actual Distribution
   3. Worst-Case Distribution
   4. Mechanical Issues

D. Empirical Results
   1. Run Profiles
   2. Complexity Performance Results
      a. Normal and Actual Distribution Results
         1. Range of Effectiveness
         2. Timing
         3. Weight
      b. Worst-Case Distribution Results

E. Conclusion

Appendices

A. General Beads World Tools Modules
B. Beads World Tools Definition Modules
C. Beads World Applications Modules
D. Distribution Disk Contents
E. Key to Graph Abbreviations

Bibliography


LIST OF FIGURES


Figure

3.1 A Portion of the 8-Puzzle State Space
3.2 Equivalent 8-Puzzle Instantiations
3.3 Beads World Variations
5.1 Search Tree (Initially)
5.2 Search Tree (Intermediate)
5.3 Search Tree (Final)
5.4 Search Graph and Tree (Initially)
5.5 Search Graph and Tree (Final)

Graph vs Ordered Search

5.6 K1, K2, K3; W=0.2; XMEAN
5.7 K1, K2, K3; W=0.5; XMEAN
5.8 K1, K2, K3; W=0.7; XMEAN
5.9 K1; W=0.9; XMEAN, XMAX
5.10 K2; W=0.9; XMEAN, XMAX
5.11 K3; W=0.9; XMEAN, XMAX
5.12 K1; W=1.0; XMEAN, XMAX
5.13 K2; W=1.0; XMEAN, XMAX
5.14 K3; W=1.0; XMEAN, XMAX
5.15 K1; W=0.9, 1.0; LMEAN
5.16 K2; W=0.9, 1.0; LMEAN
5.17 K3; W=0.9, 1.0; LMEAN


Ordered Search Discriminator Comparison

5.18 K1, K2, K3; W=0.2; XMEAN
5.19 K1, K2, K3; W=0.5; XMEAN
5.20 K1, K2, K3; W=0.7; XMEAN
5.21 K1; W=1.0; XMEAN
5.22 K2; W=1.0; XMEAN
5.23 K3; W=1.0; XMEAN
5.24 K1; W=0.9, 1.0; LMEAN
5.25 K2; W=0.9, 1.0; LMEAN
5.26 K3; W=0.9, 1.0; LMEAN


6-Puzzle vs 8-Puzzle

6.1 K1; W=0.5; XMIN, XMEAN, XMAX
6.2 K2; W=0.5; XMIN, XMEAN, XMAX
6.3 K3; W=0.5; XMIN, XMEAN, XMAX
6.4 K1, K2, K3; W=0.5; XMEAN
6.5 K3; W=0.5; LMIN, LMEAN, LMAX
6.6 K1; W=0.2, 0.5, 0.7, 1.0; XMEAN
6.7 K2; W=0.2, 0.5, 0.7, 1.0; XMEAN
6.8 K3; W=0.2, 0.5, 0.7, 1.0; XMEAN
6.9 K2; Various N; XMEAN
6.10 K1, K2, K3; W=1.0; XMEAN
6.11 K1; W=0.7, 0.8, 0.9, 1.0; LMEAN
6.12 K2; W=0.7, 0.8, 0.9, 1.0; LMEAN
6.13 K3; W=0.5, 0.7, 0.9, 1.0; LMEAN
6.14 K1, K2, K3; W=1.0; LMEAN
6.15 K1; Various N; LMEAN
6.16 K2; Various N; LMEAN
6.17 K3; Various N; LMEAN
6.18 K2; W=0.2, 0.5, 0.7, 1.0; XMAX
6.19 K1; KMIN, KMEAN, KMAX
6.20 K2; KMIN, KMEAN, KMAX
6.21 K3; KMIN, KMEAN, KMAX


Beads World Tools

7.1 Puzzle Node Record Structure
7.2a Neighbor Node List
7.2b Example Search Tree for the 3-Puzzle
7.2c Example Graph for the 3-Puzzle
7.3 Puzzle Description for a State of the 5-Puzzle
7.4 Graph Generation Algorithm
7.5 Graph Descriptor Data Structure
7.6 Data Structures for 3-Puzzle Graph
7.7 Profile Database Structures
7.8a Normal Distribution Density Curve
7.8b Distribution Function for the Density Curve of Figure 7.8a
7.9 Example of Data in a Profile Auxiliary File
7.10 Distribution File Format
7.11 Sample Input Data for GRAPH_SPACE
7.12 Puzzle States for the 3-Puzzle
7.13 Sample Input Data for SOLVE
7.14 Printed SOLVE Results


Simulation Result Graphs

8.1 8-Puzzle vs 6-Puzzle Source Profile, K3

Source Profiles

Run Profiles

8.10 K6
8.11 K7
8.12 K8
8.
8.14 K10
8.15 K11
8.16 K12

Complexity Graphs

8.17 SET K1; W=0.2, 0.5, 0.7, 0.8, 0.9; XMEAN
8.18 SET K2; W=0.2, 0.5, 0.7, 0.8, 0.9; XMEAN
8.19 SET K3; W=0.2, 0.5, 0.7, 0.8, 0.9; XMEAN
8.20 SET K1; W=0.9; LMEAN
8.21 SET K2; W=0.9; LMEAN
8.22 SET K2; W=0.9; LMEAN


LIST OF TABLES

Table

3.1 6-Puzzle Configuration 1
3.2 6-Puzzle Configuration 2
3.3 6-Puzzle Configuration 3
3.4 6-Puzzle Configuration 4
3.5 6-Puzzle Configuration 5
3.6 6-Puzzle Configuration 6
3.7 6-Puzzle Configuration 7
3.8 6-Puzzle Configuration 8
3.9 6-Puzzle Configuration 9
3.10 6-Puzzle Configuration 10
3.11 6-Puzzle Configuration 11
3.12 6-Puzzle Configuration 12
4.1 Sample Summary
5.1 Savings using Graph Search over Ordered Search, Weight = 0.9
5.2 Savings using Graph Search over Ordered Search, Weight = 1.0
8.1 Profile Sample Sizes
8.2 Disparity of Distributions
8.3 Legend of Heuristic Names


DEDICATION 


To  Libby,  my  wife, 
for  her  constant  love  and  support, 


and 


to  Kristina,  Daniel,  and  Trieste,  my  children 
for  their  patience  and  understanding 
during  the  many  long  hours  I  have  been  away 
while  involved  with  this  effort. 


I. INTRODUCTION


In  the  early  1800's,  travel  to  the  west  coast  beyond 
the  Mississippi  River  was  long  and  hazardous.  No  formal 
roads  or  trails  existed,  maps  were  primitive  and 
inaccurate, supplies along the way were scarce, and many
regions were inhabited by hostile Indians. Before
embarking  on  such  a  trek,  travelers  would  assemble 
themselves  into  wagon  trains,  and  would  enlist  the  services 
of  a  guide  or  scout  who  had  some  knowledge  of  the 
destination  and  terrain  along  the  way  to  help  select  the 
best,  quickest  and  safest  route  to  their  goal.  He  was  the 
one  the  wagon  master  would  ask  when  a  choice  in  directions 
was necessary. Chances were remote that the guide had seen
the area in question before and knew the exact answer;
instead, he had to examine the available clues, such as the
terrain and land features, before making an 'educated' guess
as to what the best direction might be. An incorrect
choice  on  his  part  could  add  days  to  the  trek,  or  lead  the 
group  into  barren  or  hostile  territory  where  the  results 
could  be  fatal. 

Today,  Artificial  Intelligence  techniques  are  being 
used  in  an  increasing  number  of  computer  applications, 
varying  from  speech  recognition  to  molecule  synthesis, 
robotics to expert systems. While the area of application
is quite broad, virtually every AI-based program uses some
form  of  heuristic  or  guide  to  assist  in  selecting  from 
among  several  choices  or  alternatives  en  route  to  a 
solution  of  the  problem  at  hand.  These  heuristics  are  to 
an  AI  program  what  the  guide  was  to  the  old-time  wagon 
train,  and  can  vary  from  almost  perfectly  educated  to 
almost  completely  uninformed.  The  perfectly  educated  guide 
leads  without  deviation  to  the  goal,  while  the  incorrect 
directions  supplied  by  the  misinformed  guide  can  lead  to 
anything  from  an  occasional,  distracting  detour  to  aimless 
meandering that never locates the destination. Just as the
wagon master might have benefited from some technique to
evaluate the scouting ability of his prospective guide
prior to their journey, the computer scientist could
benefit from having some measures to predict the
effectiveness of the heuristic guiding his program's paths.

Some investigation and research has already been
conducted regarding the prediction of a heuristic's
performance, including the effects of weighting, comparison
of heuristics of differing ability, and error behavior of
heuristics. Some theoretical work has been done also. Our
aim was to select a new domain related to this other work,
and perform empirical studies of our own. We restricted
our work to the performance of heuristics using the A*
search algorithm, which solves path-finding problems in
strongly connected finite graphs. We hope that our
results, when combined with other related research in an
increasing variety of domains, will illuminate shared and
common patterns, and that some encompassing theories will
evolve as a result.

A.  GOALS 

Specifically,  our  goals  were: 

(1) We wanted to gather a significant amount of data on
heuristic performance in one domain. We used a sliding-tile
problem, similar to the 8-Puzzle (to be described later),
for our domain.

(2)  The  A*  algorithm  we  used  to  study  heuristic 
behavior  has  two  variations  called  Ordered  Search  and  Graph 
Search. Nilsson (1980) presented the Graph Search
variation and advertised it as being less redundant than
Ordered  Search.  While  his  argument  for  Graph  Search  is 
intuitively  appealing,  we  know  of  no  research  that  provides 
empirical  results  comparing  the  two  methods  to  validate  his 
claims.  We  encoded  both  variations  and  ran  them  on  a 
common  set  of  data  to  compare  the  results. 

(3)  We  wanted  to  compare  our  results  directly  with 
those  gathered  in  other  domains.  Gaschnig  (1979)  compiled 
a  fairly  complete  set  of  empirical  studies  using  a  similar 
sliding-tiles  problem.  We  followed  his  methodology,  used 
the  same  heuristics,  and  compared  our  results  with  his. 
How do the same heuristics perform in a different domain?
What inter-search-space patterns in mean complexity and
solution quality appear?

(4) We wanted to characterize several heuristics in
terms of their average statistical behavior. These
characterizations, which we call profiles, show the range of
values the heuristic returned compared to the actual
distance. This provides insight into the error behavior of
the heuristic.

(5) Gaschnig (1979) claims that heuristics with
identical profiles can be termed 'equivalent' and that
their efficiency will be predictably the same. Using the
profiles gathered in (4), we wanted to simulate classes of
heuristics with duplicate behavior and review the results
to verify these claims. Would their results be the same if
they shared the same profile?

(6) Also using the profiles gathered in (4), we wanted
to see how accurately and completely a statistical profile
captured the 'intuition' of the original heuristic within
the same domain. Can a heuristic be described completely
using statistical performance summaries alone?

(7)  Finally,  we  wanted  to  leave  behind  a  set  of  tools 
that  were  general  enough  to  be  used  by  future  researchers 
to  gather  additional  data  in  related  domains. 

This document will proceed by describing the general
domain we used for our research, and will show that
sliding-tile problems like the 8-Puzzle belong to a broad
class of related puzzles of varying complexity. Our
sample-gathering technique will be explained, followed by a
comparison of the Ordered Search and Graph Search A*
algorithm variations. We will then compare the 6-Puzzle to
related work on the 8-Puzzle by other researchers. We will
then describe the programs and tools we used, and show the
range of puzzles over which they are designed to operate.
Finally, this thesis will conclude by discussing a
technique wherein the error behavior of a given heuristic
is captured as a statistical 'profile', which is then used
to simulate the behavior of other contrived heuristics.

A  good  deal  of  work  in  this  thesis  represents  the 
combined  efforts  of  Steven  Hansen  and  Alan  Cotterman, 
including  the  code  used  to  gather  the  empirical  results  and 
much  of  the  foundational  aspects  documented  in  the  initial 
seven chapters herein. More information about this topic
can be found in the thesis of Alan Cotterman entitled
"An Empirical Study in the Modelling of Heuristic Error
Behavior".


II. BASIC SEARCH CONCEPTS

As a foundation for each of the chapters to follow, we
briefly  describe  the  search  technique  concepts,  including 


The search process maintains a search graph composed of
nodes characterizing the individual elements discovered
within the state space. Each node consists of a
description of an element's unique state or configuration.
Any two nodes in the graph are connected to one another by
an undirected arc only when the application of one of the
rules or operators defining the state space can create one
element from the other. In addition to showing node
relationships, arcs carry the cost of generating the
'neighbor' node from the original. For our purposes, cost
will be uniform for all arcs and will represent simply a
single state transformation operation between nodes.

If  one  can  begin  at  any  given  node,  traverse  a  series 
of  arcs,  and  arrive  at  another  node,  then  a  path  is  said  to 
exist  between  the  two,  and  the  cost  of  the  path  is  simply 
the  number  of  arcs  traversed  between  them.  In  a  graph 
setting,  more  than  one  path  may  exist  between  two  given 
nodes,  and  the  various  paths  may  be  of  different  lengths 
(or  "costs").  Naturally,  when  this  occurs,  one  wishes  to 
select  the  shortest  (least  cost)  path  of  the  set. 

While  the  graph  maintains  arcs  permitting  any  and  all 
paths  to  be  traced  between  two  nodes,  it  doesn't  keep  track 
of  which  one  is  the  shortest.  For  this  reason,  a  search 
tree  must  also  be  maintained.  A  search  tree  is  a  specific 
case  of  a  graph  where  any  given  node  can  only  have  one 
parent.  The  search  tree  connects  nodes  with  directed  arcs, 
pointing  backwards  from  the  generated  node  (successor)  to  a 
single parent node. As the search process proceeds, and
multiple paths are discovered to a single successor
configuration (in effect, giving that child more than one
parent), it becomes necessary to select the parent (or
path) with the lowest cost, redirecting the parent pointer
as needed. (The multiple relationship is still maintained
via undirected arcs in the graph, however.) When the
search  process  terminates  by  finding  the  goal  sought, 
traversing  the  path  established  in  the  search  tree  by  the 
parent  pointers  gives  the  sequence  of  state  transformations 
needed  to  go  from  the  start  to  the  goal  state. 
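
The parent-pointer bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under the uniform arc cost assumed in this chapter; the names `update_parent` and `trace_path` and the dictionaries `parent` and `cost` are hypothetical and do not come from the thesis software. The sketch also does not propagate a cheaper path to nodes further down the tree, an issue the updating variations of Chapter V address.

```python
# Illustrative sketch only: function and variable names are hypothetical.

def update_parent(child, candidate, parent, cost):
    """Point child at candidate if that yields a cheaper path (each arc costs 1)."""
    new_cost = cost[candidate] + 1
    if child not in cost or new_cost < cost[child]:
        parent[child] = candidate   # redirect the single parent pointer
        cost[child] = new_cost
        return True
    return False                    # existing parent is at least as good

def trace_path(goal, parent):
    """Follow parent pointers from the goal back to the start, then reverse."""
    path = [goal]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]
```

For example, if a goal is first reached through a three-arc path and later rediscovered through a two-arc path, `update_parent` redirects its pointer and `trace_path` then returns the shorter sequence of states.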

B.  GENERAL  SEARCH  ALGORITHM 

Nilsson (1980, Pp 64-65) presents an algorithm that
solves searching problems in strongly connected, finite
graphs. (He calls the algorithm "Graphsearch", which
should not be confused with an updating variation to be
examined in Chapter V called "Graph Search".) The
algorithm builds the state space graph beginning with a
start state as a 'seed' and systematically generates the
graph (G) and search tree around it until the goal is found
(success) or until all possible moves have been discovered
(failure). A node is said to be expanded when all of its
successors have been generated, that is, when all possible
configurations one move or step away have been obtained by
applying the operators mentioned above.

To control which nodes have been expanded and which
remain to be, this algorithm uses two bookkeeping lists
called OPEN and CLOSED. Expanded nodes are placed on the
CLOSED list, while those awaiting expansion remain on the
OPEN list. The algorithm iteratively removes a node from
OPEN, places it on CLOSED, and expands it, placing each
successor generated onto the OPEN list, repeating this
sequence until it finds the goal or until OPEN is empty
(i.e., no more states can be generated). The graph G
collects all the paths discovered to each of the generated
nodes, while the best path is shown in the search tree,
using the parent pointers maintained in step 7 (listed
below). When the algorithm terminates successfully (having
found the goal on OPEN), the solution path can be traced
backwards from that node through each of its ancestors up
to the start node. The algorithm follows, and is copied
from Nilsson (1980).

1.  Create  a  new  search  graph,  G,  consisting 
solely  of  the  start  node,  s.  Put  s  on  a  list 
called  OPEN. 

2. Create a list called CLOSED that is initially
empty.

3.  If  OPEN  is  empty,  exit  with  failure. 

4.  Select  the  first  node  on  OPEN,  remove  it  from 
OPEN,  and  put  it  on  CLOSED.  Call  this  node  n. 

5.  If  n  is  a  goal  node,  exit  successfully  with 
the  solution  obtained  by  tracing  a  path  along 
the  pointers  from  n  to  s  in  G.  (Pointers  are 
established  in  step  7.) 

6.  Expand  node  n,  generating  the  set  M  of  its 
successors  and  install  them  as  successors  of 
n  in  G. 

7. Establish a pointer to n from those members of
M that were not already in G (i.e. not already
on either OPEN or CLOSED). Add these members
of M to OPEN. For each member of M that was
already on OPEN or CLOSED, decide whether or
not to redirect its pointer to n. For each
member of M already on CLOSED, decide for each
of its descendants in G whether or not to
redirect its pointer. (Chapter V will treat
this step in detail.)

8.  Reorder  the  list  OPEN,  either  according  to 
some  arbitrary  scheme  or  according  to 
heuristic  merit. 

9. Go to step 3.
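
The nine steps above can be sketched in Python as follows. This is a minimal sketch rather than Nilsson's or the thesis's implementation: the function name `graphsearch` and its parameters `is_goal`, `successors`, and `order_key` are assumptions made for illustration. Step 7 is simplified: a pointer is redirected only when the new path is strictly shorter, and the change is not propagated to descendants of CLOSED nodes.

```python
# Illustrative sketch; names and the simplified step 7 are assumptions.

def graphsearch(start, is_goal, successors, order_key):
    """Search from start; order_key(node, g) ranks nodes on OPEN (lowest first)."""
    parent = {start: None}      # search-tree pointers (step 7)
    g = {start: 0}              # arcs from start along the current best path
    open_list = [start]         # step 1
    closed = set()              # step 2
    while open_list:            # step 3: an empty OPEN means failure
        n = open_list.pop(0)    # step 4: take the first node on OPEN
        closed.add(n)
        if is_goal(n):          # step 5: trace pointers back to the start
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        for m in successors(n):             # step 6: expand n
            if m not in g:                  # step 7: m is new to G
                parent[m], g[m] = n, g[n] + 1
                open_list.append(m)
            elif g[n] + 1 < g[m]:           # step 7: shorter path to known m
                parent[m], g[m] = n, g[n] + 1
        # Step 8: reorder OPEN (Python's sort is stable, so ties keep their order).
        open_list.sort(key=lambda x: order_key(x, g[x]))
    return None                 # failure: OPEN exhausted without reaching a goal
```

Passing `order_key=lambda node, depth: depth` orders OPEN by depth alone; other orderings are obtained by changing only this function, which is the subject of step 8.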

1.  DISCUSSION 

The algorithm is self-explanatory except for steps 7 and
8, which can have several interpretations and variations.
Step  7  handles  multiple  paths  to  a  single  node  by  first 
recognizing  that  a  newly  generated  successor  has  been 
'seen'  before  (and  hence  has  more  than  one  parent).  This  is 
done  by  comparing  each  successor  with  every  entry  on  OPEN 
and  CLOSED.  A  duplicate  indicates  that  another  parent 
exists  for  this  configuration  and  that  the  new  node  is 
redundant  and  must  be  discarded.  The  rediscovered  node 
keeps  both  parents  as  neighbors  in  the  search  graph,  but 
the  search  tree  forces  the  child  to  'choose'  one  of  the 
two.  The  child  decides  by  pointing  to  the  parent  with  the 
shortest  path  to  the  root.  This  is  called  updating  a  node, 
and  there  are  two  alternative  methods  which  accomplish 
this,  called  Ordered  Search  and  Graph  Search.  These 
variations  are  discussed  in  detail  in  Chapter  V. 


2.  REORDERING  VARIATIONS 

Step  8  refers  to  reordering  the  nodes  on  OPEN.  Nodes 
are  typically  ordered  based  on  their  cost,  or  distance  from 
the  start  node.  Several  alternative  ordering  methods  are 
possible,  and  the  choice  can  drastically  affect  the 
direction  and  efficiency  of  the  resulting  search  pattern. 
Four  variations  are  presented  below,  called  Depth-First, 
Breadth-First,  A*,  and  weighted  A*. 

a.  DEPTH-FIRST  SEARCH 

One  variation  is  called  Depth-First  Search,  which  is 
characterized  by  a  search  pattern  that  proceeds  downward 
along  a  single  path  until  (1)  the  goal  node  is  found,  (2)  a 
node  on  the  existing  path  cannot  be  expanded  further 
(called  a  'terminal'  node),  or  (3)  some  arbitrary  depth 
bound  is  reached.  If  the  goal  is  found,  then  the 
algorithm  succeeds  and  terminates  with  the  solution  path. 
Otherwise,  it  backs  up  one  level,  selects  the  most 
promising  alternative  and  proceeds  downward  again.  The 
OPEN  list  in  this  variation  is  kept  in  descending  order  by 
cost,  where  cost  is  the  depth  of  the  node  from  the  start. 

b.  BREADTH-FIRST  SEARCH 

Another  variation,  called  Breadth-First  Search, 
expands  the  graph  completely  at  each  level  before  advancing 
to  the  next  deeper  level.  The  OPEN  list  in  this  case  is 
maintained  in  ascending  order  by  the  depth  of  the  node  from 
the start (same cost measure as used in Depth-First).
Breadth-First is guaranteed to find a solution if one
exists, and the one it finds will be the shortest one in the
graph (this property is referred to as 'admissibility').
This  good  feature  is  offset  by  the  cost  incurred  in 
exhaustively  enumerating  the  state  space  layer  by  layer  up 
to  the  level  in  which  the  goal  resides. 
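
The difference between the two orderings can be illustrated with a short Python sketch. It assumes a small tree with no repeated states and no depth bound, and the name `expansion_order` and the `children` mapping are hypothetical. The only change between the two searches is whether OPEN is kept in descending or ascending order of depth.

```python
# Illustrative sketch; assumes a tree (no duplicate states, no depth bound).

def expansion_order(children, start, depth_first):
    """Return the order in which nodes are expanded when OPEN is sorted by depth."""
    open_list = [(start, 0)]            # entries are (node, depth from start)
    order = []
    while open_list:
        node, depth = open_list.pop(0)  # always expand the first node on OPEN
        order.append(node)
        new = [(c, depth + 1) for c in children.get(node, [])]
        if depth_first:
            # Descending depth: deepest nodes first, newest ahead of ties.
            open_list = new + open_list
            open_list.sort(key=lambda item: -item[1])
        else:
            # Ascending depth: shallowest nodes first, newest behind ties.
            open_list = open_list + new
            open_list.sort(key=lambda item: item[1])
    return order
```

On a tree in which `s` has children `a` and `b`, `a` has children `c` and `d`, and `b` has child `e`, the depth-first ordering expands one branch to the bottom before backing up, while the breadth-first ordering finishes each level before moving deeper.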

c. A* SEARCH

Another  variation,  called  A*,  attempts  to  combine  the 
good  effects  of  both  of  the  above  through  use  of  heuristics 
to  guide  the  direction  of  the  search  pattern  by 
intelligently  ordering  OPEN.  Informally,  a  heuristic  is  a 
'rule  of  thumb',  an  educated  guess,  or  intuition  applied  to 
the  task  at  hand.  Heuristics  provide  a  simple  means  of 
indicating  which  among  several  courses  of  action  is  to  be 
preferred,  but  are  not  guaranteed  to  identify  the  most 
effective  course  of  action.  Obviously,  the  more  accurate 
and  consistent  a  heuristic  is,  the  more  effective  it  is  and 
the  more  efficient  the  resulting  version  of  A*  becomes. 

The cost of a node is computed by using the following
formula:

F(n) = G(n) + H(n)

where n is the node, G is the distance of node n from the
start node (same measure as in Depth-First and
Breadth-First), and H is a heuristic estimate of the
distance (cost) remaining to the goal. F(n) is then the
program's best estimate of the solution path length for a
path constrained to pass through node n. The OPEN list is
kept in ascending order on F.

The  heuristic  component  guides  the  directions  in  which 
the  search  tree  is  developed,  discouraging  Breadth-First 
expansion  and  permitting  the  program  to  expand  nodes  along 
paths  it  senses  (sometimes  incorrectly)  are  the  way  to  the 
goal. Without the H component, A* reduces to Breadth-First
search, and without the G component, A* relies purely
on the estimating ability of the heuristic. This is all
right if the heuristic is accurate or if many paths exist
to the goal, but can lead to long searches down dead-end
paths if H occasionally returns misleading values. The G
component serves to remind the program that the search has
been (or is being) led astray.

In addition, if the H component never overestimates
the actual distance remaining to reach the goal, the
property of admissibility is retained, and A* will always
find the shortest path from the starting state to the goal
state.
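
Whether a heuristic satisfies this condition can be checked directly on a small graph by comparing each estimate with the true distance to the goal. The sketch below is an illustration only: the names `true_distances` and `is_admissible` are hypothetical, the graph is assumed to be small enough to enumerate, and the heuristic is supplied as a table of values rather than as a function.

```python
from collections import deque

# Illustrative sketch; names are hypothetical and arcs are assumed to cost 1.

def true_distances(neighbors, goal):
    """Breadth-first distances (in arcs) from every reachable node to the goal."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        n = queue.popleft()
        for m in neighbors[n]:
            if m not in dist:
                dist[m] = dist[n] + 1
                queue.append(m)
    return dist

def is_admissible(h, neighbors, goal):
    """True when the estimate h[n] never exceeds the actual distance to the goal."""
    dist = true_distances(neighbors, goal)
    return all(h[n] <= dist[n] for n in dist)
```

Because the arcs are undirected, distances measured outward from the goal equal the distances remaining to the goal, so a single breadth-first pass supplies the actual values against which the estimates are compared.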


d.  WEIGHTED  A* 

Pohl (1970) modified the A* algorithm by using weights
to adjust the effect of the two components:


F(n) = (1-W) * G(n) + W * H(n)


In this function, G and H mean the same as they do in A*,
but the percentage of their contribution to the cost
measure F is controlled through the selection of a weight W
between 0 and 1. Note that setting the weight to zero
reduces the search to Breadth-First search, and a weight of
one ignores the G component completely, resulting in a
purely heuristic search. Using a weight of one-half evenly
balances G and H; since this merely halves every value of
G + H, it orders OPEN exactly as the classical A* algorithm
above does.
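
The effect of the weight can be seen by ordering the same OPEN list under different values of W. The following Python sketch is illustrative only: the names `weighted_f` and `order_open` are hypothetical, and the G and H values in the example entries are made up for the demonstration rather than taken from any experiment.

```python
# Illustrative sketch; names and sample values are hypothetical.

def weighted_f(g, h, w):
    """Weighted cost F = (1 - W) * G + W * H, with W between 0 and 1."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("W must lie between 0 and 1")
    return (1.0 - w) * g + w * h

def order_open(entries, w):
    """Return node names sorted in ascending order of weighted F.

    Each entry is a (name, g, h) tuple.
    """
    ranked = sorted(entries, key=lambda e: weighted_f(e[1], e[2], w))
    return [name for name, _, _ in ranked]

# Made-up (name, G, H) values for three nodes on OPEN.
entries = [("a", 1, 6), ("b", 2, 3), ("c", 4, 2)]
```

With these entries, a weight of zero ranks the nodes by G alone, a weight of one ranks them by H alone, and a weight of one-half produces the same order as sorting on G + H.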

Because  the  use  of  weights  provides  such  a  variety  of 
cost-ordering  variations,  we  chose  to  use  Weighted  A*  for 
all  of  the  work  presented  in  this  thesis.  We  will  discuss 
this  algorithm  further  in  Chapter  V,  where  we  explore 
various  updating  mechanisms. 

III. BEADS WORLD


This chapter describes the domain (or problem model)
we selected for conducting our research. We will
relate  our  methodology  to  that  of  other  AI  research,  define 
our  model,  and  show  that  it  is  part  of  a  large  family  of 
related  models. 

A.  NEED  FOR  RESEARCH  MODELS 

"Our  research  strategy  in  studying  complex 
systems  is  to  specify  them  in  detail,  program 
them  for  digital  computers,  and  study  their 
behaviour  empirically  by  running  them  with  a 
number  of  variations  and  under  a  variety  of 
conditions.  This  appears  at  present  the  only 
adequate  means  to  obtain  a  thorough  understanding 
of  their  behaviour."  (Newell,  Shaw,  and  Simon,  1963) 

A great deal of AI research to date has been conducted
using games or puzzles for the problem domain. We did the
same in this thesis. We (AI researchers) are not
fascinated with games and puzzles any more than geneticists
are enamored with fruit flies. Each simply provides an
experimental guinea pig to the researcher that is easy to
define in detail, yet whose behavior is sufficiently rich
and unpredictable to simulate the complexity found in
real-life situations. Most real-life situations are too
irregular and complex to describe concisely, and in
sufficient detail for fellow researchers to comprehend,
much less program for computer execution.

B.  THE  8-PUZZLE 

Gaschnig (1979) selected a sliding-tiles problem
called the 8-Puzzle for part of his dissertation research
because "it is a simple yet non-trivial case study in which
to explore general issues with rigor, principally the issue
of predicting algorithm performance." (Gaschnig, 1979, Pg 3)
The 8-Puzzle, a game still sold in many toy stores,
consists of eight numbered, movable square tiles placed on
a 3 X 3 matrix, with the ninth matrix element left blank or
unoccupied. Having this empty cell in the matrix makes it
possible for any orthogonally adjacent numbered tile to
move into its place, allowing the configuration of numbered
tiles to change to over 180,000 different permutations.
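
The figure of over 180,000 corresponds to 9!/2 = 181,440: only half of the 362,880 arrangements of eight tiles and a blank can be reached from any given starting state. The Python sketch below counts reachable states by exhaustive breadth-first enumeration so that this can be checked on smaller boards. It is an illustration only; the function name `count_reachable` and the board encoding (a tuple with 0 for the blank) are assumptions, not the thesis's representation.

```python
from collections import deque
from math import factorial

# Illustrative sketch; the name and state encoding are hypothetical.

def count_reachable(rows, cols):
    """Count states reachable from the solved board by sliding tiles into the blank."""
    start = tuple(range(1, rows * cols)) + (0,)   # 0 marks the blank cell
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        blank = state.index(0)
        r, c = divmod(blank, cols)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                target = nr * cols + nc
                board = list(state)
                # Slide the neighboring tile into the blank position.
                board[blank], board[target] = board[target], board[blank]
                nxt = tuple(board)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return len(seen)
```

On a 2 X 2 board the count is 12 (half of 4!), and on a 2 X 3 board it is 360 (half of 6!). Running the same function on the 3 X 3 board enumerates the full 8-Puzzle state space, which takes noticeably longer.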

By  arbitrarily  selecting  one  of  these  permutations  of 
numbered  tiles  as  a  "starting  state",  a  carefully  chosen 
sequence  of  tile  movements  about  the  matrix  will  transform 
this  initial  configuration  into  a  preselected  goal 
permutation.  The  basic  objective  is  not  only  to  maneuver 
the  tiles  so  as  to  reach  the  goal  state,  but  also  to  do  so 
in  as  few  moves  as  possible.  Here  is  an  example  of  part  of 
the  8-Puzzle  state  space: 

Figure 3.1

A  Portion  of  the  8-Puzzle  State  Space 


C.  GENERALIZING  SLIDING-TILE  PUZZLES 

The description of the 8-Puzzle above is virtually
identical to the ones provided by Gaschnig (1979, Pp 22-23),
Pearl (1984, Pp 6-7), and Nilsson (1980, Pp 18-20).
However, an alternate method of characterizing the 8-Puzzle
is to say that it has a connected ring of eight positions
surrounding a single center, and that positions are
occupied by one of eight numbered, mobile markers or
"beads". Markers are only permitted to move (1) between
adjacent ring positions, and (2) between the center and
every alternate ring position. Of course, the objective is
still  to  rearrange  an  initial  starting  configuration  into  a 
preselected  goal  configuration  in  as  few  moves  as  possible. 
The  figure  below  depicts  three  equivalent  instantiations 
based  on  this  definition. 

Figure 3.2

Equivalent  8-Puzzle  Instantiations 
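
The equivalence between the matrix description and the ring-and-center description can be checked mechanically by comparing the two sets of links. The Python sketch below is an illustration under an assumed labelling: ring positions 0 through 7 are taken clockwise around the matrix perimeter starting at the top-left corner, the center is labelled 'C', and the names `grid_links` and `ring_links` are hypothetical.

```python
# Illustrative sketch; the position labelling and names are assumptions.

def grid_links(rows=3, cols=3):
    """Undirected links between orthogonally adjacent cells of a matrix."""
    links = set()
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                links.add(frozenset({(r, c), (r + 1, c)}))
            if c + 1 < cols:
                links.add(frozenset({(r, c), (r, c + 1)}))
    return links

def ring_links(ring_size=8, center_links=(1, 3, 5, 7)):
    """Links of a ring of positions 0..ring_size-1 plus a center labelled 'C'."""
    links = {frozenset({i, (i + 1) % ring_size}) for i in range(ring_size)}
    links |= {frozenset({"C", i}) for i in center_links}
    return links

# Assumed labelling: walk the matrix perimeter clockwise from the top-left corner.
PERIMETER = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
to_cell = {i: cell for i, cell in enumerate(PERIMETER)}
to_cell["C"] = (1, 1)

# Translate every ring link into matrix coordinates.
mapped = {frozenset(to_cell[p] for p in link) for link in ring_links()}
```

Under this labelling, `mapped` equals `grid_links()`: the eight ring links and the four center links are exactly the twelve orthogonal adjacencies of the 3 X 3 matrix.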

Defining  the  8-Puzzle  in  this  way  relieves  it  of  the 
matrix  constraint  and  makes  its  ultimate  shape  irrelevant. 
We  can  characterize  it  completely  in  terms  of  its  number  of 
positions  and  by  the  set  of  legal  moves  or  transformations 
(defined by which positions are inter-connected to allow
bead movement). We can carry the abstraction one step
further  by  lifting  the  restriction  on  the  number  of 
positions  in  the  perimeter  ring.  One  is  no  longer 
constrained  to  eight  positions  as  in  the  8-Puzzle  and  could 
create  a  different  variation  with  only  five  or  ten 
positions  in  the  outer  ring. 

An  additional  generalization  is  to  lift  the  restriction 
on  every  alternate  position  being  linked  to  the  center  and 
allow  this  to  be  defined  in  whatever  number  and 
configuration  is  desired.  Such  an  extension  enables 
augmenting the orthogonal moves in the 8-Puzzle with
diagonal moves, for example. By using these abstractions
and varying the number of positions and/or changing the
configuration of center-perimeter links, an entire class of
8-Puzzle mutations can be devised, each sharing the
8-Puzzle's objectives but displaying a variety of behavior.


1.  BEADS  WORLD  DEFINITION 

The abstraction provided above gives the basic
framework for defining a class of puzzles similar to the
8-Puzzle. We refer to this class as the 'Beads World'.
Essentially, the Beads World is composed of puzzles
characterized by a set of positions linked into a ring
situated around a single center position. Numbered beads
(or markers) occupy all but one of these positions. The
number of beads used, incidentally, is what gives the
puzzle its name. The vacant or blank position is necessary
to allow the beads room to move about. Beads are permitted
to move from position to position when two conditions
exist: (1) a path or link has been established between the
two positions, and (2) the destination position is blank or
unoccupied. Since, by definition, all perimeter positions
are connected into a ring-like structure, movement between
adjacent perimeter positions is automatically allowed. In
addition, any combination of perimeter positions may be
defined as being linked to the center, so long as at least
one is. The 8-Puzzle, then, has 9 positions, 8 beads, and
every other perimeter position has a path to the center.
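The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the tool set described in this thesis: the position numbering (ring positions 0 to ring_size - 1, center last), the use of 0 for the blank, and the names `make_links` and `successors` are all assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

State = Tuple[int, ...]  # state[i] is the bead at position i; 0 marks the blank


def make_links(ring_size: int, center_links: Iterable[int]) -> Dict[int, Set[int]]:
    """Build the adjacency map for a ring of positions plus one center.

    Ring positions are numbered 0..ring_size-1; the center is position ring_size.
    """
    center_links = set(center_links)
    if not center_links:
        raise ValueError("at least one perimeter position must link to the center")
    center = ring_size
    adjacent: Dict[int, Set[int]] = {p: set() for p in range(ring_size + 1)}
    for p in range(ring_size):
        q = (p + 1) % ring_size           # ring neighbours are always linked
        adjacent[p].add(q)
        adjacent[q].add(p)
    for p in center_links:                # user-chosen center-to-perimeter links
        adjacent[p].add(center)
        adjacent[center].add(p)
    return adjacent


def successors(state: State, adjacent: Dict[int, Set[int]]) -> Iterator[State]:
    """Yield every state reachable by sliding one linked bead into the blank."""
    blank = state.index(0)
    for p in sorted(adjacent[blank]):
        nxt = list(state)
        nxt[blank], nxt[p] = nxt[p], nxt[blank]
        yield tuple(nxt)


# The 8-Puzzle: 8 ring positions, every other one linked to the center.
eight_puzzle = make_links(8, {0, 2, 4, 6})
start = (1, 2, 3, 4, 5, 6, 7, 8, 0)       # blank in the center (position 8)
moves = list(successors(start, eight_puzzle))
```

With the blank in the center, the four linked perimeter beads can move; with the blank on a perimeter position that has no center link, only its two ring neighbours can.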

Note  that  it  is  up  to  the  user  to  set  the  number  of 
positions  and  to  establish  which  perimeter  positions  have  a 
path  or  link  to  the  center.  Altering  the  number  of 
positions  affects  the  size  of  the  state  space  of  the 
puzzle. Changing the number and/or the configuration of
center-to-perimeter links redefines the rules or operators
that create the nodes in the state space, which also
affects the shape and size of the state space. Examples
of some of the many possible puzzle permutations are shown
in Figure 3.3.


FIGURE  3.3 

Beads  World  Variations 


3-Puzzle  Family 


2.  EXTENDING  THE  GENERALIZATION 
So  far,  this  discussion  has  mainly  focused  on 
generalizing  sliding-tile  (or  beads)  problems,  which  we 
have  illustrated  is  simple  to  do  and  provides  a  wealth  of 
related  family  members  with  which  to  experiment.  Our 
programs  function  with  any  of  the  members  defined  thus  far. 
However,  this  generalization  can  be  extended  even  further 
to  encompass  an  even  wider  class  of  problems. 

We  envision  a  class  of  puzzles  using  'baling  wire'  and 
beads,  where  the  wire  determines  the  paths  that  the  beads 
may  travel.  The  beads  may  or  may  not  be  marked.  Positions 
do  not  need  to  be  connected  in  a  ring,  nor  is  a  single 
center  position  required.  In  fact,  by  creating  a  separate 
ring  structure  centered  within  another  ring  structure,  with 
wires  connecting  the  two,  we  create  a  family  of  problems 
that  encompasses  the  15-Puzzle  and  all  its  relatives. 

It is even possible to mimic the blocks world within
our Beads World generalization. The blocks world consists
of N numbered (or colored) cubes which may be arranged into
various stacks on a table (Nilsson, 1980, Pg 152). It is
often used to illustrate AI planning and searching
algorithms. By abandoning the ring shape of our Beads
World model, one could devise the blocks world problems
from the Beads World definition.

All one needs to do to define a 'Beads World' puzzle
is (1) determine a basic shape (ring around a center, ring
around a ring, three-dimensional matrix, etc.), (2)
determine the number of available positions, (3) determine
the number of beads, and (4) establish the paths the beads
will traverse (or how to connect the wires up). The full
extent of the Beads World family of models has not been
explored, and we leave this as an idea for further
development.

D.  EXAMINATION  OF  THE  6-PUZZLE  FAMILY 

To illustrate the potential and versatility of the
puzzles that this model defines (and our software tools can
manipulate), we present the 6-Puzzle (consisting of seven
positions and six beads) in all of its possible link
permutations. Family members were generated by
systematically altering the number and position of links
from the perimeter slots to the center, creating twelve
unique non-isomorphic configurations. For each of the
twelve, we used a common starting state (shown below) and
generated its state space.
In the pages that follow, Tables 3.1 through 3.12
highlight the results of the twelve variations, including a
diagram of the puzzle showing the links used, a histogram
showing the relative shape of the search tree, and figures
indicating the longest sequence of moves discovered
(maximum depth of the tree), the number of possible states
reached, and the branching factor (average number of
successors from a given parent).
TABLE 3.1

6-PUZZLE CONFIGURATION 1

(Entries: Unique States, Avg # of Neighbors, Maximum Depth, Nodes at each level)

TABLE 3.2

6-PUZZLE CONFIGURATION 2

TABLE 3.3

6-PUZZLE CONFIGURATION 3

While it is interesting to compare the relative shapes
of the trees, we found two items most intriguing and worthy
of further investigation. First is the range of tree
depths, which varied from 15 to 63 levels (or moves). Also
of interest was the size of the state space generated in
comparison to the total number of combinations possible.
Some configurations could not reach every possible
arrangement of the beads. There are 7! or 5040 possible
unique states, and only configurations 2, 5, 6, and 8
through 12 reached them all. Configurations 3 and 7 only
reached 2520 states, or 1/2 of the total possible. Puzzle
4 only managed 840 states, or 1/6 of the number possible,
and puzzle 1 only created 35 states, which is 1/144 of the
number possible. There seems to be a relationship between
the number of moves in a puzzle's shortest cycle and the
number of states that puzzle can reach, where a cycle is the
shortest series of moves required to go from the center
position out to the perimeter and return without retracing
any paths previously traversed. Those puzzles reaching all
5040 nodes had a minimal cycle of 3 moves, while the other
puzzles reached subsets whose sizes were inversely related
to the length of their minimal cycles. It would be
interesting to define this relationship mathematically to
enable the prediction of this value.
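The state-space counts quoted above can be checked with a breadth-first enumeration. The sketch below is an illustration only and is not the software used for Tables 3.1 through 3.12. The start state (beads 1 to 6 around the ring, blank in the center) and the four link sets are assumptions chosen for the example, since the common starting state and the numbering of the twelve configurations are not reproduced here.

```python
from collections import deque


def reachable_states(ring_size, center_links):
    """Breadth-first enumeration of every state reachable from one start state.

    Returns (number of states, maximum depth, node count at each level).
    """
    center = ring_size
    adjacent = {p: set() for p in range(ring_size + 1)}
    for p in range(ring_size):
        q = (p + 1) % ring_size           # ring neighbours are always linked
        adjacent[p].add(q)
        adjacent[q].add(p)
    for p in center_links:                # chosen center-to-perimeter links
        adjacent[p].add(center)
        adjacent[center].add(p)

    # Assumed start state: beads 1..ring_size on the ring, blank (0) in the center.
    start = tuple(range(1, ring_size + 1)) + (0,)
    depth = {start: 0}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        blank = state.index(0)
        for p in adjacent[blank]:
            nxt = list(state)
            nxt[blank], nxt[p] = nxt[p], nxt[blank]
            nxt = tuple(nxt)
            if nxt not in depth:          # first visit gives the shortest depth
                depth[nxt] = depth[state] + 1
                frontier.append(nxt)

    max_depth = max(depth.values())
    per_level = [0] * (max_depth + 1)
    for d in depth.values():
        per_level[d] += 1
    return len(depth), max_depth, per_level


# Hypothetical link sets: two adjacent links, three alternate links,
# two opposite links, and a single link.
counts = {}
for links in [(0, 1), (0, 2, 4), (0, 3), (0,)]:
    counts[links], _, _ = reachable_states(6, links)
```

Under these assumptions the four link sets reach 5040, 2520, 840, and 35 states respectively, which are the same four sizes reported for the family. The per-level list returned by the function is the raw data behind a histogram of the search tree's shape.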


to  the  center,  whereas  in  configuration  7  (see  Table  3.7) 
they  were.  Therefore,  on  the  basis  of  puzzle  symmetry, 
state  space  decomposition  pattern,  and  link  similarity  to 
the  8-Puzzle,  we  selected  configuration  7  as  the  model  for 
the  empirical  studies  in  the  remainder  of  this  thesis,  and 
all  remaining  references  to  the  6-Puzzle  refer  to  this 
particular  configuration  rather  than  the  general  family. 

Our  work  represents  a  fairly  exhaustive  treatment  of 
only  one  configuration  of  the  6-Puzzle  family,  and  using 
our  tools,  further  research  could  be  continued  into  other 
Beads  World  configurations  to  gather  a  wider  base  of 
empirical  data  on  which  to  examine  the  behavior  of  search 
problems  using  heuristics. 


accused of being biased). The first method entailed
selecting a starting configuration and randomly applying
the transformation operations N times, creating a candidate
goal configuration. To ensure that N represented the
shortest path between the pair, the A* algorithm was used
on the candidate pair since A*, given an admissible
heuristic, guarantees that it will find the shortest path
between start and goal if such a path exists. If the length
of the resulting solution path and N were equal, the pair
was added to the sample. This was repeated to gather forty
pairs at each N, which for the 8-Puzzle ranged from one to
thirty. There were three problems
with this technique: (1) it required many executions of A*;
(2) many candidate pairs were rejected because random
application of the transformation rules did not prevent
loops and detours from occurring; and (3) finding samples
at N greater than 25 was like "finding the needle in the
haystack", and the rejection rate was so high that he
decreased his sample sizes to only eight entries at level
28 and none at levels 29 and 30.
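The accept-or-reject step of this first method can be sketched as follows. This is an illustration under stated assumptions, not Gaschnig's code: a plain breadth-first search stands in for A* as the shortest-path check, and the ten-position ring domain, `ring_moves`, and the other names are hypothetical.

```python
import random
from collections import deque


def shortest_distance(start, goal, successors):
    """Length of the shortest path from start to goal, or None if unreachable."""
    depth = {start: 0}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if state == goal:
            return depth[state]
        for nxt in successors(state):
            if nxt not in depth:
                depth[nxt] = depth[state] + 1
                frontier.append(nxt)
    return None


def candidate_goal(start, n, successors, rng):
    """Apply n randomly chosen moves to start; accept only if n is minimal."""
    state = start
    for _ in range(n):
        state = rng.choice(list(successors(state)))
    if shortest_distance(start, state, successors) == n:
        return state
    return None                           # the walk looped or detoured: reject


# Toy domain (hypothetical): ten positions on a ring, one step left or right.
def ring_moves(x):
    return [(x - 1) % 10, (x + 1) % 10]


rng = random.Random(1)
results = [candidate_goal(0, 3, ring_moves, rng) for _ in range(200)]
accepted = [goal for goal in results if goal is not None]
rejection_rate = 1 - len(accepted) / len(results)
```

Even in this tiny domain most walks double back on themselves, so most candidates are rejected. This is the same effect that made samples at large N hard to find.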

The other method involved selecting a start state and a
random permutation of the numbers 0 through 8 as the
candidate goal configuration. A* was used on each
candidate pair to determine if the pair was solvable
(remember that the state space is bifurcated, and the start
might be in one component while the goal is in the other),
and if so, what the solution path length (N) was. The
problems he encountered with this method included (1) many
A* executions, and (2) difficulty in controlling the
number of start/goal pairs found at a given N. As in his
first method, the sample size tapers off at higher values
of N.

B.  OUR  METHOD 

We could have followed either of the techniques used
by Gaschnig in the creation of his samples, but we chose
not to for three reasons: (1) his methods were computationally
expensive, (2) it was difficult to control the size of the
sample at each value N, and (3) the 6-Puzzle's smaller
state space gave us an option Gaschnig didn't have: we
could simply build the entire state space from a given
start configuration (using the same process that created
the tables in Chapter III), and carefully select our sample
from the resulting search tree. This tree shows not only
the path from the start to each of a possible 2520 candidate
goal states, but also provides the actual distance between
the pair.

While Gaschnig chose a fixed number of samples for each
number of moves (N) from the goal, we gathered a varying
number of states at each level of the search tree to form
our sample. This number consisted of a pre-determined
minimum (we used the number 5) and an additional amount
proportional to the number of nodes at that
level compared to the total number of nodes in the tree.


This  permitted  the  sample  to  be  'shaped'  as  the  graph 
itself  was,  giving  a  greater  number  of  samples  on  those 
levels  containing  the  greatest  number  of  possibilities.  It 
also  provided  an  absolute  minimum  to  select  at  those  levels 
where  relatively  few  nodes  exist.  Gathering  the  nodes  was 
a  simple  matter  of  building  the  search  tree  and  randomly 
selecting  a  proportional  number  of  'goal'  nodes  from  each 
level.  In  addition  to  outputting  the  start  and  goal 
configurations,  we  also  printed  the  actual  distance  between 
them  since  some  of  the  programs  later  on  needed  this 
information.  This  saved  the  expense  of  recalculating  the 
minimum  distance  again  later. 
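The selection rule described above can be sketched as follows. The exact formula for the proportional amount is not given in the text, so the `extra` budget, the rounding, and the decision to skip depth 0 are assumptions made for this illustration; the function name is hypothetical as well.

```python
import random


def sample_goals(levels, minimum=5, extra=100, seed=0):
    """Pick goal states level by level from a fully built search tree.

    levels[d] lists the states found at depth d. Each level contributes
    `minimum` states plus a share of `extra` proportional to the level's
    size, capped at the number of states actually available at that level.
    """
    rng = random.Random(seed)
    total = sum(len(states) for states in levels)
    sample = []
    for depth, states in enumerate(levels):
        if depth == 0:
            continue                      # depth 0 holds only the start state
        share = round(extra * len(states) / total)
        quota = min(len(states), minimum + share)
        for goal in rng.sample(states, quota):
            sample.append((goal, depth))  # keep the true distance with the goal
    return sample


# Toy tree (hypothetical): 1 start state, then levels of 3, 100, and 4 states.
toy_levels = [["s"], list(range(3)), list(range(100, 200)), list(range(300, 304))]
toy_sample = sample_goals(toy_levels)
```

Thin levels contribute everything they have, while the widest level supplies most of the sample, so the sample takes on the shape of the tree. Storing the depth with each goal avoids recomputing the minimum distance later.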

Using a minimum of 5 samples per level (assuming there
were at least five to choose from), our program generated a
total of 198 start/goal pairs. The table below summarizes
the number of goal node puzzle states taken from each level
of the search tree. Note that our sample represents 198
out of 6 million possible combinations, or a selection
ratio of 1 in 30,000. Gaschnig's sample contained 895 of
60 billion possible combinations, for a selection ratio of
about 1 in 67 million. Therefore, our sample is several orders
of magnitude more complete than his was.


V.  COMPARISON  OF  ORDERED  SEARCH  AND  GRAPH  SEARCH 

The control mechanism used by many programs to solve
searching problems in strongly connected, finite graphs is
called the A* Algorithm. This procedure provides a
methodical, efficient means of expanding nodes in a graph
setting until a goal is found (if one exists). This chapter
builds upon the introduction provided in Chapter II, focusing
primarily on the A* algorithm variations that deal with
nodes that are rediscovered during the search process.
Nilsson (1980) presented a variation we call Graph Search,
which is advertised as more efficient than the prevailing
method, called Ordered Search. We first discuss the Ordered
Search strategy, which seems to be more commonly used,
followed by Graph Search, and illustrate both with
examples. We then present the results of our empirical
comparison.

A.  OVERVIEW 

As  a  review,  in  a  graph  setting,  multiple  paths  can 
exist  to  any  single  puzzle  state.  Granted,  the  primary 
objective  of  the  search  is  to  find  any  path  to  the  goal; 
but  when  several  paths  lead  to  the  same  node,  why  not  weed 
out  the  longer  ones  in  favor  of  the  shortest  and  most 
direct  one?  Both  Ordered  Search  and  Graph  Search  do  this, 


but  their  methods  are  distinct  and  involve  tradeoffs  in  the 
computation  time  and  space  required,  and  in  the  number  of 
nodes  expanded.  Simply  stated,  the  Ordered  Search 
variation  maintains  only  a  search  tree  and  not  the  graph, 
returning  nodes  that  are  rediscovered  but  at  a  lower  cost 
back  onto  OPEN  for  possible  reexpansion  later.  Note  the 
inherent  redundancy  since  the  same  node  can  be  rediscovered 
and possibly reexpanded several times. However, it does
save the time and space needed to keep a graph structure
current.

On the other hand, Graph Search maintains both the
search tree and a sub-graph. The tree shows the least-cost
path to the root via parent pointers (just as in Ordered
Search). The graph keeps track of the 'neighborhood', or
every path to every node discovered thus far. Cheaper
paths to existing nodes are maintained by propagating the
new path information to the neighbors of the affected node
(as kept by the neighbor pointers in the graph),
redirecting parent pointers in the search tree as needed.
Propagating values through the subgraph in this manner
eliminates the need to ever re-expand nodes. This savings
is offset, however, by the overhead required to maintain
the graph.

"There is a tradeoff between the computational
cost of [maintaining the graph structure] and
computational cost of [re-expanding rediscovered
nodes]" (Nilsson, 1980, Pg 66)

"Nilsson's variation saves reexpansion effort at
expense of value propagation and pointer
redirecting effort..." (Pearl, 1984, Pg 49)

B.  ORDERED  SEARCH  ALGORITHM 

Here  is  the  Ordered  Search  variation  of  the  A* 
algorithm,  adapted  from  the  A*  algorithm  discussed  in 
Chapter  II  of  this  thesis.  Note  the  differences  in  steps  1 
and  7,  which  will  be  discussed  momentarily. 


1.  Create a new search tree, T, consisting
    solely of the start node, s. Put s on a list
    called OPEN.

2.  Create a list called CLOSED that is initially
    empty.

3.  If OPEN is empty, exit with failure.

4.  Select the first node on OPEN, remove it from
    OPEN, and put it on CLOSED. Call this node n.

5.  If n is a goal node, exit successfully with the
    solution obtained by tracing a path along the
    pointers from n to s in T. (Pointers are
    established in step 7.)

6.  Expand node n, generating the set M of its
    successors.

7.  For each node m in M do the following:

    (A) If m not on OPEN or CLOSED then
            establish n as the parent of m,
            add m to OPEN

    (B) If m on OPEN
            if cost of new m < old m
                remove old m from OPEN,
                discard old m,
                make n the parent of new m,
                add new m to OPEN,
            otherwise
                ignore new m

    (C) If m on CLOSED then
            if cost of new m < old m
                remove old m from CLOSED,
                discard new m,
                make n the parent of old m,
                adjust cost of old m,
                add old m to OPEN
            otherwise
                ignore new m.

8.  Reorder the list OPEN, either according
    to some arbitrary scheme or according to
    heuristic merit.

9.  Go to step 3.
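The nine steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program used in this thesis: every move costs 1, the rediscovery test compares G (depth in the search tree), OPEN is ordered by F = G + H, and the small directed graph and heuristic at the end are hypothetical values chosen so that a node is closed before a cheaper path to it is found.

```python
def ordered_search(start, goal, successors, h=lambda state: 0):
    """Ordered Search variant of A* on a unit-cost graph.

    Returns (solution path, number of node expansions).
    """
    g = {start: 0}                        # cheapest known depth of each node
    parent = {start: None}                # search tree T (steps 1 and 7)
    open_list = [start]                   # step 1
    closed = set()                        # step 2
    expansions = 0
    while open_list:                      # step 3
        n = open_list.pop(0)              # step 4
        closed.add(n)
        if n == goal:                     # step 5: trace pointers back to s
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1], expansions
        expansions += 1
        for m in successors(n):           # step 6
            new_g = g[n] + 1
            if m not in g:                # 7(A): never seen before
                open_list.append(m)
            elif new_g < g[m]:            # 7(B)/(C): cheaper path found
                if m in closed:           # 7(C): back onto OPEN for re-expansion
                    closed.remove(m)
                    open_list.append(m)
            else:
                continue                  # ignore the new copy
            g[m] = new_g
            parent[m] = n
        open_list.sort(key=lambda s: g[s] + h(s))   # step 8 (stable sort)
    return None, expansions               # step 3: OPEN exhausted


# Hypothetical example: the long route to m is explored first because
# the estimate for c is inflated.
edges = {"s": ["a", "c"], "a": ["b"], "b": ["m"], "c": ["m"],
         "m": ["u"], "u": ["v"], "v": ["t"], "t": []}
estimate = {"c": 5}
path, expansions = ordered_search(
    "s", "t", lambda x: edges[x], lambda x: estimate.get(x, 0))
```

In this example nodes m, u, and v are each expanded twice: once along the longer route through a and b, and again after the shortcut through c returns m to OPEN.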

1.  ORDERED  SEARCH  DISCUSSION 

Ordered Search maintains only the search tree (and not
the state space graph), showing the nodes expanded and the
parentage of each (see step 7). Generally, the leaf nodes
in the search tree are those on OPEN awaiting expansion,
while interior nodes correspond to those on CLOSED. When a
node is expanded, some of the children generated will have
already been discovered as children of another node.
Normally, a newly generated node will not have been 'seen'
before, and is therefore not on OPEN or CLOSED (step 7a).
However, if a node for the same puzzle state is found on
OPEN or CLOSED, steps 7b and 7c examine which of the two
nodes to keep and which to discard on the basis of cheapest
cost. (We will discuss later precisely what is meant by
cost; for now, assume it to refer to the distance between a
node and the root of the search tree.) If the cost of the
new node is not lower, the new node is discarded because the
path to the new node is no shorter (it might be a loop or
simply the same state reached deeper in the tree).

If the new node has a lower cost value, a shortcut has
been discovered which needs special handling. If the old
node was on OPEN, it is simply discarded and replaced by
the new node (step 7b). If the old node was on CLOSED
(meaning it has already been expanded and has children), it
is removed from CLOSED, its old cost is replaced with the
cheaper value, its parent pointer is
redirected to the new path, and it is then reinserted onto
OPEN. The successors of the old node have no way of
knowing their parent has been redirected to a new path, and
will only discover this when the parent is later reexpanded
and they are thus 'recreated' with the new value.

2.  ORDERED  SEARCH  EXAMPLE 

Suppose  the  search  process  has  generated  the  search 
tree  shown  in  Figure  5.1.  The  solid  nodes  are  on  CLOSED, 
and  the  other  nodes  are  on  OPEN  at  the  time  the  algorithm 
selects  node  1  for  expansion;  arcs  represent  parent-child 
relationship  between  nodes. 


Figure  5.1 

Search  Tree  (Initially) 


When  node  1  is  expanded,  its  single  successor,  node  2,  is 
generated  (see  Figure  5.2).  But  node  2,  with  parent  node  3 
in  the  search  tree,  had  previously  been  generated,  and  node 
2  is  also  on  CLOSED  with  successor  node  5.  Since  the 
algorithm  now  discovers  a  path  to  node  2  through  node  1 
that  is  less  costly  than  the  previous  path  through  node  3, 
the  parent  of  node  2  in  the  search  tree  is  changed  from 
node  3  to  node  1,  and  node  2  is  removed  from  CLOSED  and 
placed  once  again  on  OPEN. 


Figure  5.2 

Search  Tree  (Intermediate) 


Later,  the  search  algorithm  will  select  node  2  for 
expansion,  generating  node  5,  and  let  us  suppose  node  4 
also.  Node  4  is  already  on  OPEN  with  parent  node  6,  but 
the  cost  through  node  2  is  less,  so  node  4  has  its  cost 
value  adjusted  and  parent  altered  to  node  2.  Node  5  is  also 
on  OPEN  since  it  was  left  there  last  time  node  2  was 
expanded,  but  its  cost  is  less  than  before,  so  its  parent 
is  still  node  2  but  at  a  lower  cost.  The  adjusted  search 
tree  is  shown  in  Figure  5.3. 


Figure 5.3
Search  Tree  (Final) 


changes propagated to the neighborhood in the graph, rather
than simply reexpanding all of those nodes. A node, once
expanded, is never reconsidered (never put on OPEN again).
Maintaining the graph structure adds time and space costs
to those already incurred by maintaining the search tree;
as nodes are generated, they are not only associated with
one parent, they are also linked by arcs to additional
parents or neighbors if they are known. Maintaining the
graph becomes more expensive as the number of neighbors
increases. Here is the Graph Search variation of the A*
algorithm:


6.  Expand node n, generating the set M of its
    successors and install them as successors of n in
    G.

7.  For each node m in M do one of the following:

    (A) If m not on OPEN or CLOSED then
            establish n as the parent of m,
            add m to OPEN

    (B) If m on OPEN
            if cost of new m < old m
                make n the parent of old m,
                adjust old m to new cost,
                discard new m
            otherwise
                discard new m

    (C) If m on CLOSED then
            if cost of new m < old m
                make n the parent of old m,
                adjust old m to new cost and
                propagate to old m neighbors
                discard new m
            otherwise
                discard new m.

8.  Reorder the list OPEN, either according to some
    arbitrary scheme or according to heuristic merit.

9.  Go to step 3.
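A sketch of the Graph Search variation in Python follows. It is an illustration under stated assumptions, not the program used in this thesis: steps 1 through 5 are taken to match the Ordered Search listing with an explicit graph G added, every move costs 1, the rediscovery test compares G-values (depth), and the small directed graph and heuristic at the end are hypothetical.

```python
def graph_search(start, goal, successors, h=lambda state: 0):
    """Graph Search variant of A* on a unit-cost graph.

    Returns (solution path, number of node expansions). A node that has
    been expanded is never put back on OPEN; cheaper costs are pushed
    through the stored graph instead.
    """
    g = {start: 0}
    parent = {start: None}                # search tree T
    graph = {}                            # G: node -> successors installed so far
    open_list = [start]
    closed = set()
    expansions = 0
    while open_list:                      # step 3
        n = open_list.pop(0)              # step 4
        closed.add(n)
        if n == goal:                     # step 5
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1], expansions
        expansions += 1
        graph[n] = list(successors(n))    # step 6: install successors in G
        for m in graph[n]:
            new_g = g[n] + 1
            if m not in g:                # 7(A): new node
                g[m], parent[m] = new_g, n
                open_list.append(m)
            elif new_g < g[m]:            # 7(B)/(C): cheaper path to a known node
                g[m], parent[m] = new_g, n
                pending = [m] if m in closed else []
                while pending:            # 7(C): propagate through G
                    node = pending.pop()
                    for k in graph.get(node, ()):
                        if g[node] + 1 < g[k]:
                            g[k], parent[k] = g[node] + 1, node
                            pending.append(k)
        open_list.sort(key=lambda s: g[s] + h(s))   # step 8 (stable sort)
    return None, expansions


# Hypothetical example: the long route to m is explored first because
# the estimate for c is inflated.
edges = {"s": ["a", "c"], "a": ["b"], "b": ["m"], "c": ["m"],
         "m": ["u"], "u": ["v"], "v": ["t"], "t": []}
estimate = {"c": 5}
path, expansions = graph_search(
    "s", "t", lambda x: edges[x], lambda x: estimate.get(x, 0))
```

When the shortcut through c is found, the lower cost of m is pushed on to u, v, and t through the stored successor lists, so no node is expanded a second time. The price is the extra `graph` dictionary and the propagation loop.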


1.  GRAPH  SEARCH  DISCUSSION 

The difference between this algorithm and the one
presented for Ordered Search is the value propagation
performed in step 7c. For each neighbor of m (as per the
graph structure), if the neighbor's cost is greater than
what it would be going through m, the neighbor's parent
pointer is altered to point to m and its cost is adjusted.
This change is then propagated in turn to that neighbor's
own neighbors. Propagation is continued until no neighbors
are affected by the new value.
2. GRAPH SEARCH EXAMPLE

The following example illustrates the Graph Search
process, and is adapted from Nilsson (1980, Pp 64-66):

Suppose a search process has generated the search
graph and search tree shown in Figure 5.4. Since they are
superimposed on each other, the dark arrows along certain
arcs in this search graph are the pointers that define
parents of nodes in the search tree. The solid nodes are
on CLOSED, and the other nodes are on OPEN at the time the
algorithm selects node 1 for expansion.




Figure  5.4 

Search  Graph  and  Tree  (Initially) 


When node 1 is expanded, its single successor, node 2,
is generated, and installed as a neighbor of node 1. But
node 2, with parent node 3 in the search tree, had
previously been generated, and node 2 is also on CLOSED
with successor nodes 3, 4, and 5. Note, however, that node
4's parent in the search tree is node 6, because the
shortest path from s to node 4 in the search graph is
through node 6. Since the algorithm now discovers a path
to node 2 through node 1 that is less costly than the
previous path through node 3, the parent of node 2 in the
search tree is changed from node 3 to node 1. The costs of
the paths to the descendants of node 2 in the search graph
(namely, the paths to nodes 3, 4, and 5) are recomputed. The
costs for nodes 4 and 5 are now also lower than before,
with the result that the parent of node 4 is changed from
node 6 to node 2. Node 3 is left as it was since the path
through node 2 is the same cost as its existing path. The
adjusted search tree is defined by the pointers on the arcs
of the search graph of Figure 5.5.


Figure  5.5 

Search  Graph  and  Tree  (Final) 


D.  EMPIRICAL  COMPARISON  RESULTS 


We ran both Ordered Search and Graph Search on a
common sample of 198 start-goal pairs at varying depths,
using 3 different heuristics (the three heuristics are
described in the next chapter, but are referred to as K1,
K2, and K3) and at 7 different weights, for a total of 4158
problem executions per algorithm. CPU time for Ordered
Search was 15 hours, and for Graph Search was 27 hours. This
certainly confirms that the run-time cost of maintaining the
graph in this setting is very expensive. The space requirements
were also greater for Graph Search because additional
memory was required to maintain the graph structure.

Graphical  comparisons  of  the  results  with  respect  to 
the  total  number  of  nodes  expanded  and  the  solution  path 
length  found  for  problems  of  different  depths  are  included 
in  the  pages  to  follow  (Figures  5.6  through  5.17). 

Figures 5.6, 5.7, and 5.8 show that the re-expansion
effort was so negligible (or nonexistent) for weights less
than 0.8 that the curves representing the number of nodes
expanded for Ordered Search are superimposed directly over
their Graph Search counterparts. At a weight of 0.9, some
minor differences between Ordered and Graph Search appear
(see Figures 5.9, 5.10, and 5.11), and increase
dramatically at weight 1.0 (Figures 5.12, 5.13, and 5.14).
At weight 0.9, the savings of using Graph Search over
Ordered Search averages 5% for K1, 8% for K2, and only 0.9%
for K3 (see Table 5.1). At weight 1.0, the savings is much


nodes  on  OPEN  in  Graph  Search  is  not  necessarily  the  same 
as  in  Ordered  Search,  the  two  algorithms  are  likely  to 
expand  nodes  in  an  order  different  from  one  another.  And 
since  many  paths  can  lead  to  the  goal,  this  accounts  for 
the  inconsistent  behavior  in  the  path  lengths  found  by  the 
two algorithms. Neither algorithm was consistently better
or worse; the two simply did not produce identical results
in every case.
Figure 5.9

Graph vs Ordered Search
Heuristic K1
Weight = 0.9
XMEAN, XMAX

(Plot of Nodes Expanded (X) against Depth of Goal (N); legend: optimal, breadth first, XMax and XMean for K1 under Ordered Search and Graph Search at weight 0.90)

Figure 5.15

Figure 5.16

Graph vs Ordered Search
Heuristic K2
Weight = 0.9, 1.0
LMEAN
TABLE 5.1

Savings using Graph Search over Ordered Search

Weight = 0.9

(figures expressed in percentages)

              K1             K2             K3
Level     Mean   Max     Mean   Max     Mean   Max

   1        0      0       0      0       0      0
   2        0      0       0      0       0      0
   3        -      0       0      0       0      0
   4        0      0       0      0       0      0
   5        0      0       0      0       0      0
   6        0      2       0      0       0      0
   7        1      1       0      0       0      0
   8        2      3      10     12       0      0
   9        2      2      12     19       3      4
  10        4      4       8     16       0      0
  11        7     20      14     16       0      1
  12        8     21      10     22       0      1
  13       10     23      14     32       0      0
  14        9     21      11     21       2      2
  15        9     16      16     32       2      5
  16        8     12      11      5       4      0
  17        9     17      17     10       4      8
  18       11     20      12     18       2      9
  19        8      7      16     23       0      0
  20       13     18      14     17       0      -

E. A* ALGORITHM AMBIGUITIES

1.  IMPORTANCE  OF  TIE-BREAKING  POLICY 

The reordering of the OPEN list (step 8 of A*) seems
rather trivial, but deserves some attention because it can
affect performance results. Step 7 generates new nodes or
changes some existing nodes' F-values, and step 8 ensures
that the OPEN list remains ordered since the search
algorithm always chooses the first node to expand next.
Chapter III discussed the impact that ordering this list can
have on the resulting search patterns.

The problem is that step 8 isn't explicit about how to
order nodes of equal value (referred to as 'breaking
ties'). Our method places newest nodes in front of other
nodes of the same value, maintaining OPEN in ascending
order otherwise. (We also incorporate step 8 into step 7
so that as nodes are generated, we place them on OPEN in
ascending order, saving an expensive reorganization of the
entire list in a separate step.) This has the effect of
encouraging the search deeper along the most recently
generated path.
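The tie-breaking policy described above amounts to inserting each node at the leftmost slot among entries of equal F-value. A minimal Python sketch, assuming OPEN is held as a list alongside a parallel list of F-values (the function name and this representation are illustrative assumptions, not the thesis implementation):

```python
import bisect


def insert_newest_first(open_list, f_values, node, f):
    """Insert node into OPEN, kept in ascending F order, ahead of equal-F nodes."""
    i = bisect.bisect_left(f_values, f)   # leftmost slot among equal F-values
    f_values.insert(i, f)
    open_list.insert(i, node)


open_list, f_values = [], []
for node, f in [("a", 3), ("b", 5), ("c", 3), ("d", 4)]:
    insert_newest_first(open_list, f_values, node, f)
```

After these four insertions the newer node c sits ahead of a, although both have F = 3. Replacing `bisect_left` with `bisect_right` would give the opposite, queue-like policy, in which the oldest node of a given value is expanded first.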

When a node is expanded, its successors are created
deterministically and therefore always occur in the
same sequence. This does not change no matter how many
times the node is reexpanded. Where two successors have
the identical F-value, they will also always be inserted
onto OPEN in the same order. Ordered Search, then, will
not change this expansion sequence even though a node may
be redundantly reexpanded several times. On the other
hand, Graph Search expands a node only once and propagates
cheaper paths to the applicable successors. This update
process involves, for each neighbor of the rediscovered
node and for their successors, recalculating the cost,
redirecting parent pointers to the new path, removing
altered nodes from OPEN, and then reinserting them in their
new order.

While we know that the sequence in which successors are
generated will never vary, the order of the neighbor list
is neither predictable nor apparent; nodes are added as
they are discovered, yet their order ultimately dictates an
order of nodes on OPEN. Suppose two of the descendants end
up with the same value. The neighbor updated first will
end up behind the ones added later because of the
tie-breaking policy.

It  is  difficult  to  assess  the  impact  of  this  minor 
point  on  the  performance  of  the  two  algorithms.  We  feel 
that  even  though  Graph  Search  expanded  fewer  nodes  than 
Ordered  Search,  Graph  Search  results  could  be  improved  even 
more  by  finding  a  method  of  avoiding  the  additional 
shuffling  that  the  update  procedure  does  to  the  nodes  on 
the  OPEN  list.  Switching  methods  from  stack  (like  ours)  to 
queue  does  not  avoid  the  phenomenon,  but  merely  causes  it 
to  manifest  itself  elsewhere.  Certainly  there  is  room  for 
further  investigation  into  this  topic. 


2.  DEFINING  'COST' 

An important issue with regard to the Ordered Search
and Graph Search algorithms that has not been consistently
dealt with in the literature is the cost measure used to
determine whether to reexpand or update a rediscovered
node. Step 7 of both algorithms bases its decision to
redirect parent pointers on a value which until now
has generically been referred to as 'cost'.

Nodes are ordered on OPEN on the basis of their F
value, which is composed of a G component (distance from
the root of the search tree) and an H component (estimate
of the distance remaining to the goal). The discriminator
used in step 7 could be either the G component or the F
value.

Since any rediscovered node will always calculate to the
same H, the only way to tell if the path is shorter is by
examining the G component. In the unweighted version of
A*, either F or G could be used with no effect on the
results, because if G changes, F also changes. In the
Weighted A* algorithm (which we used), this is still the
case for all weights less than 1.0. At weight 1.0,
however, F is based purely on the heuristic component (H),
and G is given no weight at all. Therefore, when F is used
as the discriminator in this one special case, each
rediscovered node will always have the same F value, and
will be automatically discarded because its F-value doesn't
involve the G component to inform the program that it is on
a shorter path. This is not the case, however, if G is
used as the discriminator in step 7.
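
The special case can be made concrete with a short sketch
(Python, for illustration only). It assumes the weighting
F = (1 - W) * G + W * H, which matches the statement that G
receives no weight at W = 1.0 but is otherwise an assumption
on our part here; the function names and sample values are
hypothetical.

```python
def f_value(g, h, w):
    """Weighted evaluation, assuming F = (1 - W) * G + W * H."""
    return (1.0 - w) * g + w * h


def is_shorter_path(old_g, new_g, h, w, discriminator):
    """Decide whether a rediscovered node should replace its earlier copy."""
    if discriminator == "G":
        return new_g < old_g
    if discriminator == "F":
        return f_value(new_g, h, w) < f_value(old_g, h, w)
    raise ValueError("discriminator must be 'F' or 'G'")


# A node first reached at depth 6 is rediscovered at depth 4; H is 3 both times.
at_half_f = is_shorter_path(6, 4, 3, 0.5, "F")   # F drops from 4.5 to 3.5
at_half_g = is_shorter_path(6, 4, 3, 0.5, "G")
at_one_f = is_shorter_path(6, 4, 3, 1.0, "F")    # F is 3.0 both times
at_one_g = is_shorter_path(6, 4, 3, 1.0, "G")
```

Under this assumption the two discriminators agree at W = 0.5,
while at W = 1.0 only the G test notices the shorter path.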

Empirically, this means that at W=1.0, the Weighted A*
algorithm using Ordered Search with F as the discriminator
should expand fewer nodes than the same version using G,
while the G version ought to find shorter paths. Since
Graph Search never reexpands nodes, the only difference
should be that Graph Search using F would find longer paths
than using G, but nodes expanded should be the same.


a.  RESULTS  OF  USING  F  OR  G  FOR  COST  DISCRIMINATOR 

The results for the Ordered/Graph Search comparison
presented so far were generated using G as the
discriminator in step 7, not only because it returned the
most accurate results, but also because it preserved the
nature of the A* algorithm with no special cases. However,
the next chapter compares the 6-Puzzle to Gaschnig's
8-Puzzle; since he used Ordered Search with F as the
discriminator, we used the same (Ordered Search using F)
in order to provide a direct comparison. Thus, we can also
compare the results of using F and G as discriminators
within the Ordered Search algorithm.

Figures 5.18 through 5.26 present the results generated
using Ordered Search at various weights using F versus G
as the discriminator (the lower graph). Figures 5.18
through 5.20 show that there is no difference between the
two discriminators using K1, K2, and K3 at weights 0.2,
0.5, and 0.7, since the performance curves for the version
using F are superimposed on top of their G version
counterparts. However, at weight 1.0, significant
differences are observed, with the G version expanding many
more nodes than the F version (Figures 5.21, 5.22, and
5.23). The observed differences are greatest for K1 and
very small for K3.

The lower curves in Figures 5.24 through 5.26 show that
the path length discovered at weight 0.9 is identical
between the G version and the F version of Ordered Search.
The results observed at weights lower than 0.9 were also
identical for both versions, but their graphs are not
included, to conserve space. Notice that the path lengths
at weight 1.0 were different. The path length for K1 was
much higher for the G version (Figure 5.24), but not
consistently so at every N. The path lengths for K2
(Figure 5.25) were only slightly higher for the G version,
but again, not consistently for every N. The path lengths
for K3 (Figure 5.26) reverse the trend, and show that the F
version produced longer solution paths for every N.

It  was  expected  that  the  two  versions  would  be 
identical  in  the  lengths  of  the  solution  paths  found,  so 
the  variations  observed  above  were  somewhat  surprising. 

The explanation for this behavior is the same as the reason
given for the differences in path length between Ordered
Search and Graph Search in Chapter V, section D. The OPEN
list of each variation does not contain the same number or
combination of nodes, and so the resulting search patterns
will vary slightly, with the possibility that different
paths to the goal are discovered. The A* algorithm only
guarantees that of the paths discovered, the shortest will
be reported.

It is common in the literature to take the number of [...]

[...] obtaining dynamic storage from the system is high
[...] Graph Search would be favored.

[...] the issue of choosing F or G as the discriminator
[...] redirecting parent pointers, our results show a [...]
performance at weight 1.0, or opting for better performance
by using F as the discriminator and introducing a special
case into the A* search algorithm.

VI. 6-PUZZLE/8-PUZZLE COMPARISON


In  a  previous  chapter,  we  indicated  that  one 
configuration  of  links  in  the  6-Puzzle  family  appeared 
strikingly  similar  to  the  8-Puzzle.  In  this  chapter,  we 
compare  the  two  puzzles  to  each  other  by  empirically 
comparing  the  performance  of  A*  using  the  same  heuristics 
on  both.  Besides  simply  comparing  the  6-Puzzle  to  another 
domain,  establishing  this  similarity  is  important  because 
then  further  experiments  performed  on  the  6-Puzzle  could 
yield  results  which  may  be  considered  more  general,  and 
also  valid  in  the  8-Puzzle  domain.  This  is  especially 
attractive  because  experimentation  on  the  6-Puzzle  is  much 
more  cost-effective. 

Gaschnig (1979) did extensive research on the 8-Puzzle
using three heuristic functions at a variety of
weights. In effect, he held the domain fixed and varied the
heuristics; we did the same using the 6-Puzzle. We moved
his three heuristics into the 6-Puzzle domain and conducted
the same series of executions at a variety of weights, just
as Gaschnig did. Thus we not only held the domain
fixed and varied the heuristics but, by comparing our
results with his, we are also able to observe the
result of holding the heuristics fixed and varying the
domain!


A. 8-PUZZLE HEURISTICS

A brief discussion of heuristics and the F function was
given in Chapter II. Their purpose is to guide the search
process by intelligently ordering nodes on OPEN so that the
most promising nodes are expanded first, hopefully leading
directly to the goal without detours along the way.
Basically, we used heuristics that estimated the number of
moves remaining to reach the goal by examining certain
aspects of the current puzzle's configuration. These are
the definitions Gaschnig gave for the three heuristics he
used in his empirical studies with the 8-Puzzle:


"K1 = number of tiles that occupy a board location
      in s different from the location occupied by
      that tile in the goal node.

 K2 = the sum, over all 8 tiles in s, of the
      minimum number of moves required to move the
      tile from its location in s to its desired
      location in the goal node, assuming that no
      other tiles were blocking the way.

 K3 = K2 + 3 * SEQ(s)

      where SEQ(s) counts 0 if the non-central
      squares in s match those in goal up to [one]
      rotation about the board perimeter, and
      counts two for each tile not followed (in
      clockwise order) by the same tile as in the
      goal node."

K1 is simply a count of the number of tiles 'out of
place' in the puzzle. The number of moves remaining to the
goal will never be less than the number of tiles out of
place, since each move repositions only one tile, so this
heuristic never overestimates the distance remaining. It
also has an upper bound, in that no more than 8 tiles can
be out of place because that is the number of tiles in the
puzzle.


K2 (also known as Manhattan Distance, or the "city-block"
distance) counts the number of positions each tile
is out of place. This represents the number of moves it
would take each out-of-place tile to get into its goal
position if tiles could move over each other (which they
cannot do in real life). This heuristic provides a more
realistic estimate than K1 does and, like K1, is an
underestimator, since tiles blocking the path must be dealt
with.
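
As an illustration of the K1 and K2 definitions, the
following Python sketch computes both for the 8-Puzzle. It
is not the thesis implementation. It assumes a state is a
tuple of nine tile numbers listed row by row on a 3 x 3
board with 0 for the blank, and that the blank is not
counted as a tile; the sample start and goal states are
chosen only for the example.

```python
def k1(state, goal):
    """Number of tiles (blank excluded) not on their goal square."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)


def k2(state, goal, width=3):
    """Sum over tiles of the row distance plus column distance to the goal square."""
    goal_square = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue                      # the blank is not a tile
        row, col = divmod(i, width)
        goal_row, goal_col = goal_square[tile]
        total += abs(row - goal_row) + abs(col - goal_col)
    return total


# Sample states, row by row; 0 marks the blank.
start = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)
goal = (1, 2, 3,
        8, 0, 4,
        7, 6, 5)
misplaced = k1(start, goal)    # tiles 2, 8, 1 and 6 are out of place
manhattan = k2(start, goal)    # 1 + 2 + 1 + 1
```

For any state K2 is at least as large as K1, since every
misplaced tile contributes at least one move to the sum.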

K3 (also called the Enhanced Manhattan Distance)
uses a combination of K2 and a measure called SEQ. The
purpose of SEQ is to assess the relative placement of
perimeter tiles with each other, assigning a numeric
penalty for tiles not followed by the proper "next" tile,
and in essence, giving the estimate of the number of moves
required to swap the order of two inverted tiles. Note
that this heuristic can overestimate the actual distance to
the goal, distinguishing it from K1 and K2.


supposed  to  be  followed  by  tile  8,  should  this  situation 
count  2  or  not? 

To ensure that our encoding of the three heuristics was
correct, especially in the case of K3, we ran them against
a variety of start/goal puzzle states, and collected
the heuristic estimates versus the actual distance to the
goal for each node expanded en route to the goal. This
allowed us to compare our heuristics' estimates to
Gaschnig's, since he used the same technique to collect
values from his and reported these values in his
dissertation (see Figures 6.19a, 6.20a, and 6.21a).

We  found  amazing  agreement  with  Gaschnig  for  K1  and  K2 
(see  Figures  6.19b  and  6.20b).  Our  K3  was  grossly 
overestimating,  however.  So  we  modified  K3  to  ignore  any 
bead  comparisons  involving  a  blank  position  but  to  count  2 
for  every  bead  not  immediately  followed  (clockwise)  by  the 
bead  supposed  to  be  there  as  dictated  by  the  goal  state. 
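
One possible reading of this modified SEQ rule is sketched
below in Python. It is an interpretation for illustration,
not the thesis code, and it rests on several assumptions: a
state lists the center position first and then the perimeter
positions in clockwise order, 0 marks a blank, a comparison
is skipped whenever either perimeter position of the pair
holds a blank, and a bead with no perimeter successor in the
goal always counts.

```python
def seq(state, goal):
    """Count 2 for each perimeter bead not followed clockwise by its goal successor."""
    ring = state[1:]              # index 0 is the center position (assumed layout)
    goal_ring = goal[1:]
    n = len(ring)
    # Bead that should follow each perimeter bead, as dictated by the goal.
    wanted = {goal_ring[i]: goal_ring[(i + 1) % n] for i in range(n)}
    total = 0
    for i in range(n):
        bead, follower = ring[i], ring[(i + 1) % n]
        if bead == 0 or follower == 0:
            continue              # comparison involves a blank position
        if wanted.get(bead) != follower:
            total += 2
    return total


# Hypothetical 9-position example with the blank in the center.
goal_state = (0, 1, 2, 3, 4, 5, 6, 7, 8)
rotated = (0, 8, 1, 2, 3, 4, 5, 6, 7)      # perimeter rotated one step
swapped = (0, 2, 1, 3, 4, 5, 6, 7, 8)      # beads 1 and 2 inverted
```

Under these assumptions a pure rotation of the perimeter
scores 0, while a single inverted pair disturbs three
follower relationships and scores 6.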

After running this version of K3, the new estimates
compared more favorably to Gaschnig's, but were still
somewhat overstated (see Figure 6.21a and b). We feel that
some of this is due to the factor 3 by which SEQ is
multiplied. This factor, originally obtained by empirical
study, must capture something that is unique to the
8-Puzzle and should probably be adjusted for the 6-Puzzle.

We  defend  our  use  of  the  second  implementation  of 
Gaschnig's  K3  by  giving  an  example  where  the  result  of 
counting  the  blank  is  not  consistent  with  the  results 


as good. Gaschnig used the Ordered Search variation of the
Weighted A* algorithm, with F as the discriminator for node
reexpansion (described in the previous chapter). To
provide a fair comparison of results between our work and
his, we implemented the same variations.

The  following  pages  contain  graphs  (Figures  6.1 
through  6.21)  representing  the  execution  of  our  heuristics 
upon  a  sample  of  198  start-goal  pairs,  using  weights  of 
0.2,  0.5,  0.6,  0.7,  0.8,  0.9,  and  1.0.  Each  figure  shows 
both  the  6-Puzzle  (lower  graph)  and  the  8-Puzzle  (upper 
graph)  results  to  simplify  their  comparison  as  much  as 
possible.  We  attempted  to  duplicate  his  format  as  closely 
as  possible.  However,  there  will  be  some  slight 
differences  in  axis  scalings  and  symbols.  We  will  not 
attempt  to  summarize  each  of  the  graphs,  but  invite  the 
reader  to  compare  for  himself  the  striking  similarity 
between  the  two  domains,  and  refer  him  to  Gaschnig's 
dissertation  for  a  thorough,  eloquent  description  of  their 
individual meanings. While 'eyeing in' graphs is not a
mathematically precise method of comparison, it does
suffice in this case. A summary and conclusion follow the
graphs.

(Please  refer  to  Appendix  E  for  a  description  of  the 
terms  used  in  the  graphs.) 

1. DISCUSSION


We found amazing agreement between the performances of
our K1 and K2 heuristics and Gaschnig's K1 and K2 for nodes
expanded at every weight. Our K3 was similar to Gaschnig's
K3, but the agreement is not as close, as is evident when
it is compared to K2 in Figures 6.4 and 6.10. In both of
these figures, Gaschnig's K2 produced consistently poorer
results than K3, but in our implementation, this is not
always the case.

Another  minor  inconsistency  was  that  the  solution  path 
lengths  that  our  heuristics  discovered  were  slightly  better 
than  Gaschnig's,  although  the  relative  shapes  of  the  curves 
are  roughly  the  same.  We  feel  this  is  because  of  the 
difference  in  size  of  the  state  spaces  of  the  two  domains, 
making  it  inherently  easier  to  find  longer  solution  paths 
in the 8-Puzzle than in the 6-Puzzle. However, Figure 6.14
shows that the path lengths for our K2 and K3 are
intertwined, while Gaschnig's K2 and K3 were distinctly
tiered.

D.  CONCLUSION 

We hope the reader was as impressed with the similarity
between the two domains as we were. The results were
similar enough to conclude that changing the domain had
little effect on these heuristics, and to say that the
6-Puzzle is a close cousin to the 8-Puzzle. It would be
interesting to apply the experiments conducted in this
chapter to other Beads World variations to see if the
observed similarity is shared on a broader scale than just
between the 6-Puzzle and the 8-Puzzle.


RICC  VAX-11/780,  but  again  with  one  exception,  are  not 
necessarily  dependent  on  the  VAX/VMS  environment.  These 
tools  are  generalized  —  in  other  words,  with  the  proper 
configuration  of  a  few  global  domain-description  variables, 
these  tools  automatically  reconfigure  themselves  to  work 
with  any  of  the  Beads  World  configurations.  No  recoding  is 
necessary  —  all  reconfiguration  may  be  accomplished  by 
changing  input  data  to  the  application  programs. 

A  fair  amount  of  attention  was  given  to  applying  sound 
software  engineering  principles  in  the  implementation  of 
these  tools.  As  currently  implemented  the  software  is 
comprised  of  well  over  3000  lines  of  Pascal  and  FORTRAN 
code.  In  order  to  reduce  program  source  modules  to  a 
manageable  size,  and  to  avoid  the  duplication  of  code 
across  many  similar  but  different  applications,  the  module 
facility  provided  by  VAX  Pascal  has  been  used  extensively. 
These  modules  are  somewhat  separable  —  the  user  only  need 
link  with  those  modules  which  contain  data  structures  or 
procedures  which  must  be  imported  for  the  particular 
application.  (As  it  turns  out,  these  modules,  although 
cleanly,  functionally  separated,  are  so  tightly 
interrelated  in  their  overall  operation  that  they  are 
almost  always  all  necessary.)  Inside  these  modules,  data 
structures  and  procedures  have  been  packaged  in  their 
cleanest possible form. This, combined with the logical
organization imposed by the module structures, makes the
program code almost self-documenting, allowing convenient
maintenance and enhancement of this package by future
users.

From  the  AI  researcher's  perspective,  the  tools  in 
this  package  can  be  divided  into  three  different  functional 
categories.  Some  of  the  tools  are  involved  with 
investigating  the  graph  characteristics  of  various  Beads 
World  configurations:  expansion  of  a  complete  graph, 
analysis  of  its  characteristics,  enumeration  of  its 
elements,  and  the  generation  of  a  representative  sample  of 
its  nodes.  Most  of  the  tools  are  involved  with 
investigating  A*  heuristic  search,  providing  flexibility  as 
to  which  control  structures  and  heuristics  are  used,  as 
well  as  in  the  type  of  data  that  is  gathered.  A  third  and 
somewhat  separate  set  of  tools  is  involved  with  displaying 
the  data  generated  by  the  routines  of  the  second  category 
in  a  convenient  and  meaningful  graphic  form. 

From  the  program  organization  perspective,  this 
package  can  again  be  divided  into  three  functional 
categories.  The  first  consists  of  several  modules  which 
provide  general  utility  functions,  control  structures, 
heuristics,  and  statistical  aids  which  can  be  used  by  a 
variety  of  applications.  The  second  category  consists  of  a 
set  of  applications  modules  for  investigating  search 
spaces,  profiling  heuristics,  and  solving  Beads  World 
puzzles  to  collect  performance  data  on  heuristic  search 
techniques. The third module is again separate from the
other two, and is responsible for the graphic output of
data.


The following sections describe the various utility
modules and applications programs, both in terms of their
structure and operation (where appropriate for
understanding the tools and their behavior), and in terms
of their use by future researchers who wish to use the
applications provided or to create their own applications.
The source code is listed in Appendices A through C and is
also included on the resource diskette provided with this
document, as detailed in Appendix D.

B.  DATA  STRUCTURES  —  PUZZLE  AND  GRAPH  REPRESENTATION 

Before  describing  the  structure  and  function  of  the 
various  utility  procedures  and  programs  provided  and  the 
use  of  these  in  other  applications  programs,  it  will  first 
be  useful  to  describe  the  fundamental  data  structures  used 
throughout the package. These represent unique Beads
World puzzle states, the graph which represents all of the
possible states in a Beads World configuration and their
relationships, and the search tree which is created during
the performance of a search between two states of this
graph.

Figure 7.1 illustrates the structure of the most
fundamental unit, the puzzle node descriptor. The first
field describes the state of the puzzle, or the particular
locations of each of the tiles or beads and the blank or
blanks. Tile positions are always numbered, with 1
identifying the center position and 2 through n
identifying, in order, clockwise positions about the center
of the puzzle. Tiles are also numbered, with 0 indicating
a blank and 1 through n-1 uniquely indicating each of the
beads. STATE is a packed character array with space for
puzzles with up to 10 positions. Each element of the array
corresponds to a location, and the character content of
each element specifies the tile located at that position.
The use of a packed array for describing puzzle states
saves space and makes state comparisons convenient.

Figure 7.1 Puzzle node record structure.

The next four fields of the puzzle node record are the
links used for list and tree maintenance by the various
graph production and puzzle solution routines. LEFT and
RIGHT are used to maintain doubly-linked lists of puzzle
nodes; SORT_LEFT and SORT_RIGHT are used to maintain
binary trees of puzzle nodes.

The  next  field,  NEIGHBORS,  is  a  pointer  to  a  list  of 
neighbor  pointers  —  this  is  the  structure  which  links 
puzzle  nodes  into  a  graph.  Neighbor  nodes  (illustrated  in 
Figure  7.2(a))  are  simply  elements  of  a  singly-linked  list 
of  pointers  to  other  puzzle  nodes.  A  node's  neighbor  list 
is  then  a  list  of  all  nodes  which  may  be  obtained  by 
performing  a  single  state  transformation  operation  on  the 
state  of  that  node.  Neighbor  lists  are  used  during 
creation  of  the  graph  associated  with  a  Beads  World 
configuration,  and  also  by  the  graph  search  version  of  A* 
used  for  solving  Beads  World  problems. 

The  PARENT  pointer  is  used  to  construct  a  search  tree 
as  the  A*  algorithm  explores  a  subset  of  a  Beads  World 
graph.  Figure  7.2(b)  illustrates  the  search  tree 
structure.  Note  that  although  once  fully  constructed, 
neighbor  lists  do  not  change,  a  node's  parent  pointer  may 
change  several  times  as  alternate  solution  paths  are 
explored. Figure 7.2(c) illustrates the complete graph for
the "3-puzzle" and the start and goal states for the search
tree of Figure 7.2(b).
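
The record layout described so far can be summarized in a
short sketch. It is written in Python rather than the VAX
Pascal of the actual package, so object references stand in
for pointers and a Python list stands in for the
singly-linked neighbor list. The field names follow the
text; G_VALUE and H_VALUE are the search fields of the same
record, and everything else about the class (defaults,
types, the sample states) is an assumption made for the
example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class PuzzleNode:
    state: str                                        # one character per position, "0" = blank
    left: Optional["PuzzleNode"] = field(default=None, repr=False)        # doubly-linked list
    right: Optional["PuzzleNode"] = field(default=None, repr=False)
    sort_left: Optional["PuzzleNode"] = field(default=None, repr=False)   # binary tree links
    sort_right: Optional["PuzzleNode"] = field(default=None, repr=False)
    neighbors: List["PuzzleNode"] = field(default_factory=list, repr=False)
    parent: Optional["PuzzleNode"] = field(default=None, repr=False)      # search tree link
    g_value: int = 0                                  # distance from the starting node
    h_value: int = 0                                  # heuristic estimate to the goal


# Two hypothetical states one move apart.
start = PuzzleNode("3402")
successor = PuzzleNode("3042")
start.neighbors.append(successor)     # graph edge; fixed once built
successor.parent = start              # search tree edge; may be redirected later
successor.g_value = start.g_value + 1
```

Holding the state as a string keeps the property noted
above: two states can be compared, or ordered
alphabetically, with a single comparison.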

The  final  three  fields  of  the  puzzle  node  description 
record  are  used  primarily  during  the  A*  search  procedure. 
G_VALUE  contains  that  node's  current  distance  from  the 
starting  node.  H_VALUE  contains  the  current  heuristic's 
estimate  of  that  node's  distance  from  some  goal  node  in  the 

C. GENERALIZED TOOLS MODULES


1.  UTILITIES 

The  utilities  module  packages  a  variety  of  functions 
and  procedures  for  manipulating  the  data  structures 
described  in  the  previous  section.  Among  these  are 
routines  handling  dynamic  allocation  and  deallocation, 
linked  list  manipulation,  tree  manipulation,  and  the  input 
and  output  of  puzzle  state  descriptions.  The  source  code 
for  this  module  is  listed  in  Appendix  A.  Each  of  the 
routines  is  described  below. 

There  are  two  procedures  which  provide  a  convenient 
way  to  read  and  write  puzzle  state  descriptions.  Procedure 
READ_STATE  reads  from  the  standard  input  a  puzzle 
description  for  a  puzzle  of  size  n  into  a  packed  puzzle 
state  array.  The  format  for  this  puzzle  description  is 
shown in Figure 7.3. PRINT_STATE accepts an n-puzzle state
description and writes it to the standard output in the
same format.

Figure 7.3 Puzzle description for a state of the 5-puzzle.

    bead numbers  (340215)
    [ positions    123456 ]


CREATE_PUZZLE_NODE  dynamically  allocates  a  new  puzzle 
descriptor  node  with  the  desired  state  and  initializes  all 
of  the  other  fields  to  NIL  or  0  as  appropriate.  The 
complementary  procedure  FREE_NODE  deallocates  puzzle  nodes, 
but  in  addition,  systematically  frees  the  neighbor  node 
elements  of  that  node's  neighbor  list. 

Linked  lists  of  puzzle  nodes  are  used  throughout  the 
various  modules  which  make  up  this  package.  All  of  these 
lists are doubly-linked, and have a header node whose
G_VALUE  is  a  count  of  the  number  of  elements  on  the  list. 
Lists  in  the  program  code  are  then  simply  pointers  to  the 
header  nodes  of  these  doubly-linked  lists.  Several 
functions  and  procedures  are  provided  for  manipulating 
these  lists.  CREATE_EMPTY_LIST  returns  a  pointer  to  a 
header  node  with  no  list  elements.  IS_EMPTY  is  a  boolean 
function  which  returns  a  TRUE  value  if  the  LIST  has  no 
nodes  other  than  the  header.  PLACE_ON_END_OF_LIST  adds  a 
node  to  a  list  in  a  queue-like  manner. 

PLACE_IN_ASCENDING_ORDER  is  a  special  procedure  (used 
by  A*  in  ordering  OPEN  lists)  which  places  a  node  on  a  list 
by  ascending  F  value.  This  routine  resolves  ties  in  the  F 
value  by  placing  the  most  recent  node  with  some  F  value 
before  all  of  the  other  nodes  with  that  same  value.  As  was 
discussed  in  a  previous  section,  this  can  have  an  impact  on 
the  performance  of  A*;  the  tie-resolution  convention  can 
be  easily  changed  by  a  small  change  inside  this  routine. 
REMOVE_FROM_FRONT_OF_LIST performs the expected operation,
returning a pointer to the first node on the list and
removing it from the list. DELETE_FROM_LIST removes the
specified node from anywhere in the list. Finally,
FREE_LIST deallocates all of the nodes on LIST and then
disposes of the header.
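
The list conventions above can be sketched as follows. The
sketch is in Python, not the Pascal of Appendix A, and it
assumes a circular arrangement in which the header links to
itself when the list is empty; the text states only that
lists are doubly-linked with a counting header. The routine
names mirror the ones described, and the node class is
hypothetical.

```python
class ListNode:
    def __init__(self, name=None, f_value=0.0):
        self.name = name
        self.f_value = f_value
        self.g_value = 0          # on a header node this holds the element count
        self.left = None
        self.right = None


def create_empty_list():
    header = ListNode()
    header.left = header.right = header   # assumed circular layout
    return header


def is_empty(lst):
    return lst.right is lst


def _link_before(lst, position, node):
    """Splice node in just ahead of position and bump the header count."""
    node.left, node.right = position.left, position
    position.left.right = node
    position.left = node
    lst.g_value += 1


def place_on_end_of_list(lst, node):
    _link_before(lst, lst, node)          # ahead of the header means last


def place_in_ascending_order(lst, node):
    cur = lst.right
    while cur is not lst and cur.f_value < node.f_value:
        cur = cur.right
    _link_before(lst, cur, node)          # lands ahead of equal-F nodes


def delete_from_list(lst, node):
    node.left.right = node.right
    node.right.left = node.left
    node.left = node.right = None
    lst.g_value -= 1


def remove_from_front_of_list(lst):
    if is_empty(lst):
        return None
    node = lst.right
    delete_from_list(lst, node)
    return node
```

Changing the comparison in place_in_ascending_order from
`<` to `<=` would place a new node behind its equals, which
is the kind of small change to the tie-resolution
convention mentioned above.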

Our  implementations  of  A*  use  binary  trees  to  keep 
track  of  nodes  already  generated.  Three  routines  are 
provided  which  allow  the  use  of  the  binary  tree  structure. 
INSERT_IN_TREE  inserts  the  node  being  pointed  to  in  a  tree 
in  "inorder"  fashion,  keyed  by  the  alphabetic  order  of  the 
state  representations  held  in  the  node  and  the  nodes 
already  on  TREE.  INSERT_IN_TREE  does  not  attempt  to 
balance  the  tree. 


FIND_IN_TREE and FIND_STATE_IN_TREE perform
essentially the same function. Given a pointer to a puzzle
node, FIND_IN_TREE tries to find and return a pointer to
the node in the tree having the same puzzle state. If no
such node is found, FIND_IN_TREE returns NIL.
FIND_STATE_IN_TREE, given a puzzle state rather than a
node, returns a pointer to the node in TREE having the
desired puzzle state.
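
A minimal sketch of these tree routines follows, in Python
for illustration only. It assumes that states are strings
compared alphabetically, that duplicates are sent to the
right, and that the lookup by state returns nothing when the
state is absent; the class and the sample states are
hypothetical.

```python
class TreeNode:
    def __init__(self, state):
        self.state = state
        self.sort_left = None
        self.sort_right = None


def insert_in_tree(tree, node):
    """Insert node by alphabetic state order (no balancing); return the root."""
    if tree is None:
        return node
    cur = tree
    while True:
        if node.state < cur.state:
            if cur.sort_left is None:
                cur.sort_left = node
                return tree
            cur = cur.sort_left
        else:
            if cur.sort_right is None:
                cur.sort_right = node
                return tree
            cur = cur.sort_right


def find_state_in_tree(tree, state):
    """Return the node holding state, or None if it is not in the tree."""
    cur = tree
    while cur is not None:
        if state == cur.state:
            return cur
        cur = cur.sort_left if state < cur.state else cur.sort_right
    return None


def find_in_tree(tree, node):
    """Same lookup, keyed by the state held in another node."""
    return find_state_in_tree(tree, node.state)


root = None
for s in ["3402", "0342", "3420"]:
    root = insert_in_tree(root, TreeNode(s))
```

Because the tree is never rebalanced, its shape depends on
the order in which states are generated.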

Finally,  two  procedures  are  provided  to  allow  the 
deallocation  of  the  search  tree  and  graph  structures  that 
are  created  out  of  puzzle  nodes  and  their  neighbor  lists. 
FREE_BINARY_TREE  recursively  deallocates  the  nodes  in  a 
binary  tree.  FREE_GRAPH  recursively  disposes  of  all  of  the 
nodes  and  neighbor  lists  which  make  up  a  Beads  World  graph. 


2.  CONTROL  STRUCTURES 

This module contains the routines which perform graph
space generation and which implement the two versions of
the A* search algorithm discussed in Section 5. In
addition, this module exports data structures which
describe the characteristics of a graph space and the
results of a search (and the routines which initialize
these). Most of the routines provided by the utilities
module are imported by the control module, as well as the
highest level routine from the heuristics module described
below.

Two  routines  are  primarily  involved  with  the 
generation  of  Beads  World  graphs.  GENERATE_GRAPH  accepts  a 
starting  puzzle  state  and  generates  a  complete  graph  from 
this,  returning  three  items.  The  first  is  a  pointer  to  the 
inorder  binary  tree  containing  all  of  the  puzzle  nodes, 
ordered  alphabetically  by  state  descriptor.  The  second  is 
a  pointer  to  the  generated  graph  structure.  (Note  that 
these  pointers  always  point  to  the  same  node  —  the  one 
which  contains  the  starting  state.)  The  third  is  a  graph 
descriptor  record,  whose  structure  is  described  below. 

Although there is no start or end to the graph space
associated with a particular Beads World configuration, as
a practical consideration, the generation of this graph
space has to start somewhere. The algorithm which
GENERATE_GRAPH uses is similar to the basic shell of the A*
algorithm, and is given in Figure 7.4.

Figure 7.4 Graph generation algorithm.

Unexplored (unexpanded) graph nodes are maintained on an
OPEN list, which is ordered by increasing "depth" in the
graph (distance from the starting node). Nodes to be
expanded are removed from the front of the list, which
means that the graph generation proceeds in a breadth-first
manner. When each node is removed for expansion, all of its
successors are generated. Each successor is in turn
examined. It is first assigned a G value (depth) one
greater than its parent's; then the search tree is examined
to see if this is a new node or a previously discovered
one. If it is new, then it is added to the parent's
neighbor list, to the search tree, and to OPEN. If it is a
previously discovered node, then the previous one is added
to the parent's neighbor list. Eventually, all nodes will
have been previously discovered, leaving OPEN empty. The
algorithm then terminates, leaving a graph structure such
as the one in Figure 7.2(c), "rooted" at the starting node.
The most interesting feature of this structure is that all
of the nodes contain in G_VALUE their distance from the
starting node. This is what is meant by a graph "rooted"
at the start, and is an extremely useful feature, as will
be seen in the following sections.
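
The breadth-first generation just described can be sketched
in a few lines of Python. This is an illustration, not the
GENERATE_GRAPH source: dictionaries stand in for the search
tree and the neighbor lists, and the toy successor function
(a blank sliding along a row of three positions) is a
hypothetical domain chosen only to keep the example small.

```python
from collections import deque


def generate_graph(start_state, successors):
    """Expand every reachable state breadth-first; return depths and neighbor lists."""
    depth = {start_state: 0}           # plays the role of G_VALUE
    neighbors = {start_state: []}
    open_list = deque([start_state])   # ordered by increasing depth
    while open_list:
        state = open_list.popleft()
        for succ in successors(state):
            if succ not in depth:      # new node: record it and queue it
                depth[succ] = depth[state] + 1
                neighbors[succ] = []
                open_list.append(succ)
            neighbors[state].append(succ)   # new or old, it is a neighbor
    return depth, neighbors


def line_successors(state):
    """Toy domain (hypothetical): the blank '0' swaps with an adjacent position."""
    i = state.index("0")
    result = []
    for j in (i - 1, i + 1):
        if 0 <= j < len(state):
            cells = list(state)
            cells[i], cells[j] = cells[j], cells[i]
            result.append("".join(cells))
    return result


depth, neighbors = generate_graph("012", line_successors)
```

Because nodes leave OPEN in order of depth, the first
depth recorded for a state is its shortest distance from the
start, which is why the finished graph can be described as
rooted there.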

In  addition  to  building  a  graph  rooted  at  START_STATE, 
GENERATE_GRAPH  fills  in  a  corresponding  data  structure 
which  describes  the  features  of  the  particular  graph  in 
question.  This  structure  is  a  graph  descriptor,  and  is 
shown  in  Figure  7.5.  The  DEPTH  field  tells  the  maximum 
distance  of  any  node  from  the  starting  node.  GENERATED 
tells  how  many  nodes  were  created  in  the  expansion  of  the 
graph.  EXPANDED  tells  how  many  nodes  were  actually 
explored, and is thus a count of how many states there are
in the graph. LEVEL is an array of records, each describing a
certain "level" or group of nodes at the same depth in the
graph.  Each  level  record  contains  a  count  of  the  number  of 
nodes  at  that  level,  and  also  contains  a  pointer  to  a 
doubly-linked  list  of  these  nodes.  Space  is  provided  in 
the  array  for  up  to  100  level  records. 

The procedure INITIALIZE_GRAPH_DESCRIPTOR sets all of
the  various  counting  fields  to  zero,  and  creates  an  empty 
level  list  for  each  of  the  levels  in  the  level  array.  As 
GENERATE_GRAPH  constructs  a  graph  it  keeps  track  of  the 
number  of  nodes  generated  and  expanded,  and  also  the 
maximum  depth.  As  each  node  in  the  graph  is  expanded,  it 
is  placed  on  the  appropriate  level  list.  The  structures 
resulting  from  a  call  on  GENERATE_GRAPH  for  a  version  of 
the  "3-puzzle"  are  shown  in  Figure  7.6. 
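
As an illustration of the graph descriptor, the sketch below derives the depth, the state count, and the per-level node lists from a table of node depths. The class and field names are hypothetical Python stand-ins for the Pascal record; the GENERATED count is omitted because it must be tallied during generation, and plain lists replace the doubly-linked level lists.

```python
from dataclasses import dataclass, field

@dataclass
class GraphDescriptor:
    depth: int = 0                               # maximum distance from the root
    expanded: int = 0                            # number of states in the graph
    levels: list = field(default_factory=list)   # levels[d] = states at depth d

def describe(g_value):
    """Build a descriptor from a {state: depth} table."""
    desc = GraphDescriptor()
    desc.depth = max(g_value.values())
    desc.levels = [[] for _ in range(desc.depth + 1)]
    for state, d in g_value.items():
        desc.levels[d].append(state)             # place node on its level list
        desc.expanded += 1
    return desc
```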

Before  moving  on  to  a  discussion  of  the  A*  control 
structure  implemented  in  this  module,  a  description  of  the 
method  of  successor  generation  is  appropriate.  A  general 
support  procedure  called  GENERATE_SUCCESSORS  accepts  as 
input  a  puzzle  state  descriptor,  and  returns  a  list  of 
newly  created  descendant  puzzle  nodes  containing  all  of  the 
states  that  may  be  obtained  by  one  legal  transformation  on 
the input state. As described previously, a legal
transformation  is  defined  as  moving  a  bead  to  an  adjacent, 
blank  position  that  is  connected  by  an  arc  or  a  link. 
GENERATE_SUCCESSORS is completely general to all Beads
World configurations, and is used by the control structures
implemented in this module.

Figure 7.6 Data structures created by a call on
GENERATE_GRAPH for the 3-puzzle. The binary
search tree is shown separately for clarity,
but is actually superimposed on the graph.
Neighbor lists have been omitted for clarity.

The three global Beads World
configuration  variables  are  contained  in  this  module  as 
static  variables.  NUM_POSITIONS  tells  how  many  positions 
there  are  in  the  puzzle.  NUM_LINKS  is  an  integer  which 
tells  how  many  outer  positions  are  connected  to  the  center 
with  links.  LINK  is  a  boolean  array  with  elements 
corresponding  to  each  of  the  outer  positions.  If  a  link 
exists  between  a  position  and  the  center,  then  the 
corresponding  boolean  in  LINK  is  TRUE. 
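
A possible shape for successor generation is sketched below. The layout is an assumption made only for this illustration: index 0 of the state string is the center, the remaining indices are the outer positions in ring order joined by arcs, link[i] marks an outer position linked to the center, and "_" denotes a blank. None of these encodings is taken from the actual Pascal source.

```python
def generate_successors(state, link, blank="_"):
    """Return every state reachable by moving one bead into an adjacent blank.

    Assumed layout (hypothetical): index 0 is the center, indices 1..n-1 are
    the outer positions in ring order, and link[i] is True when outer
    position i is linked to the center.
    """
    n = len(state)
    outer = n - 1

    def adjacent(p):
        if p == 0:                                  # center: linked outer positions
            return [i for i in range(1, n) if link[i]]
        ring = {1 + (p - 2) % outer, 1 + p % outer} - {p}   # arcs on the perimeter
        return sorted(ring) + ([0] if link[p] else [])

    successors = []
    for b, ch in enumerate(state):
        if ch != blank:
            continue
        for p in adjacent(b):
            if state[p] != blank:                   # a bead that can move into b
                cells = list(state)
                cells[b], cells[p] = cells[p], cells[b]
                successors.append("".join(cells))
    return successors
```

For a hypothetical four-position configuration with links at outer positions 1 and 3, the state "_ABC" (blank in the center) has two successors, one for each linked bead.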

The  two  versions  of  A*  described  in  Section  5  above, 
ordered  search  and  graph  search,  are  both  implemented  in 
this module. ORDERED_SEARCH accepts as inputs a START and
GOAL  state,  a  HEURISTIC  selector,  and  a  WEIGHT,  and  returns 
a  RESULTS  description  in  the  form  of  a  results  descriptor 
record.  GRAPH_SEARCH  has  the  same  arguments.  The  basic  A* 
algorithm  and  both  the  ordered  search  and  graph  search 
versions  of  it  have  already  been  described  in  general  terms 
in  previous  sections  of  this  document.  Highlights  of  the 
implementations  of  these  are  described  below. 

Rather  than  maintain  two  distinct  node  lists,  OPEN  and 
CLOSED,  these  implementations  maintain  an  OPEN  list  and  a 
binary  search  tree.  OPEN  contains  only  those  nodes  which 
have  been  generated  but  which  remain  to  be  expanded,  and  is 
ordered  by  increasing  F  value.  The  search  tree  is  a  binary 
tree  on  which  all  generated  nodes  are  placed,  in  order, 
alphabetically by their character state descriptions. With
these structures, a binary tree search is performed to see
if  a  node  has  been  previously  discovered.  A  node  is 
defined  to  be  CLOSED  if  it  is  in  the  tree  but  not  on  OPEN. 

ORDERED_SEARCH does not maintain a graph structure
among the nodes visited during the search; it only
maintains an implicit search tree of parents and
successors.  Thus,  the  neighbor  list  fields  are  not  used 
and  no  neighbor  lists  are  kept.  When  nodes  are 
rediscovered  on  shorter  paths,  they  are  simply  reexpanded. 
Backwards  chaining  parent  pointers  maintain  the  implicit 
search  tree. 

GRAPH_SEARCH also maintains an implicit search tree by
keeping parent pointers among all successor nodes.
However, this tree structure is superimposed on a graph
structure  which  represents  the  subset  of  the  complete  Beads 
World  graph  that  has  been  explored  to  that  point  in  the 
search.  As  each  node  is  generated,  the  binary  search  tree 
is  searched  for  its  state.  If  it  is  a  new  node,  it  is 
added  to  its  parent’s  neighbor  list,  its  parent  pointer  is 
directed  to  the  parent  node,  its  F  value  is  calculated,  and 
it  is  placed  on  OPEN  and  the  binary  search  tree.  If  it  is 
a  rediscovered  node  on  a  shorter  path,  then  the  new  node  is 
discarded,  and  the  old  one  is  updated  with  the  new  path 
information.  If  this  node  was  CLOSED,  the  results  of  the 
change  in  path  information  are  propagated  throughout  the 
sub-graph  by  a  recursive  update  procedure.  If  this  node 
was OPEN, it is repositioned on OPEN at the proper location for
its new F value.
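
The control flow of the graph search version can be sketched as below. Several simplifications are assumptions of this sketch and not features of the Pascal implementation: a dictionary replaces the binary search tree, a heap with stale entries replaces the ordered OPEN list, a rediscovered CLOSED node is simply reopened instead of being repaired by the recursive update procedure, and the weighting f = (1 - w)g + wh is assumed for the WEIGHT parameter.

```python
import heapq

def graph_search(start, goal, successors, h, weight=0.5):
    """Weighted A* sketch; returns (path, generated, expanded)."""
    def f(g, state):
        # Assumed weighting of path cost against the heuristic estimate.
        return (1 - weight) * g + weight * h(state, goal)

    g_value = {start: 0}                 # plays the role of the search tree
    parent = {start: None}               # backward-chaining parent pointers
    open_heap = [(f(0, start), start)]
    closed = set()
    generated, expanded = 1, 0
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                     # stale entry left by an earlier push
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], generated, expanded
        closed.add(node)
        expanded += 1
        for succ in successors(node):
            generated += 1
            g = g_value[node] + 1
            if succ not in g_value or g < g_value[succ]:
                g_value[succ] = g        # new node, or a shorter path to an old one
                parent[succ] = node
                closed.discard(succ)     # reopen instead of a recursive update
                heapq.heappush(open_heap, (f(g, succ), succ))
    return None, generated, expanded
```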

In  order  to  provide  a  cleaner  packaging  of  these 
routines,  all  of  the  information  about  a  search  run  is 
packaged  in  a  record  structure  called  a  RESULTS_DESCRIPTOR, 
which  is  passed  as  a  VAR  parameter  and  is  filled  in  by  the 
search  routines.  The  RESULTS  record  contains  a  boolean, 
SOLVED,  which  indicates  whether  a  solution  path  was  found 
from  the  starting  state  to  the  goal  state.  It  also  has 
fields  which  hold  the  PATH_LENGTH,  the  number  of  nodes 
GENERATED,  and  the  number  of  nodes  EXPANDED.  The  minimum 
path length, MIN_PATH_LENGTH, is filled in by the calling
routine,  as  it  is  usually  provided  with  the  start  and  goal 
states.  The  HEURISTIC  and  WEIGHT  fields  indicate  which 
heuristic  and  weight  were  used  in  the  solution  of  a 
particular  puzzle.  Finally,  two  pointer  fields,  START  and 
GOAL,  point  to  the  starting  node  and  the  last  node  (goal 
node  in  the  case  of  a  successful  solution)  on  the  path. 
INITIALIZE_RESULTS  simply  clears  the  results  descriptor.  A 
utility  function,  PRINT_PUZZLE_SOLUTION,  is  provided;  this 
routine  accepts  a  results  description  record  and  prints  the 
puzzle  states  contained  in  the  nodes  on  the  path  from  the 
start  to  the  goal. 

3.  HEURISTICS 

This module contains all of the functions used to
calculate heuristic estimates of distance to the goal, and
is intended as the module which future users will alter
most frequently to suit their particular research needs.
It contains routines which fall into three categories:
those associated with abstracting the three traditional
"8-puzzle" heuristics to the general Beads World, those which
provide  the  ability  to  generate  and  use  heuristic  profiles, 
and  those  which  provide  other  modules  with  access  to  these 
functions. Each of these categories is described below.
Alteration  and  use  of  the  heuristic  module  is  described  in 
a  later  section. 

The  basic  heuristic  module  contains  three  routines 
which  abstract  the  three  traditional  "8-puzzle"  heuristics 
described  in  Section  6.  TILES_MISPLACED  accepts  the 
current  state  and  the  goal  state  as  inputs  and  returns  a 
count  of  the  number  of  tiles  (beads)  which  are  not  in  the 
same  positions.  MANHATTAN_DISTANCE  abstracts  the  idea  of 
shortest "city-block" distance into the idea of the
smallest  number  of  moves  to  put  beads  in  their  proper 
positions,  assuming  that  there  are  no  beads  in  the  way 
requiring  movement.  ENHANCED_MANHATTAN_DISTANCE 
corresponds  directly  to  the  third  heuristic,  and  counts  the 
score  of  nodes  out  of  sequence  about  the  perimeter  of  the 
puzzle.  These  routines  are,  as  with  the 
GENERATE_SUCCESSORS  procedure,  completely  generalized  to 
all  Beads  World  configurations,  and  are  implemented  with 
the  support  of  three  routines  which  calculate  the  minimum 
distances between states using moves constrained to go
through center or perimeter positions.


An  important  and  useful  tool  for  analyzing  the 
behavior  of  heuristic  functions  and  for  simulating  this 
behavior  at  varying  levels  of  abstraction  is  the  ability  to 
profile  heuristic  functions.  This  involves  recording,  for 
each call on a particular heuristic function (h), the true
remaining distance to the goal (n) and that heuristic's
estimate of the remaining distance (k). The resulting data
is,  for  each  heuristic,  a  set  of  tuples  representing  the 
(n,  k)  combinations  encountered  and  the  frequency  of  each 
combination.  The  heuristics  module  automatically  collects 
this  data  in  a  large,  three-dimensional,  integer  array  with 
indices  h,  n,  and  k.  This  array  is  initialized  to  be  all 
zero  when  the  heuristics  module  is  initialized.  When  all 
of  the  heuristic  estimates  have  been  performed,  this  data 
may  then  be  written  out  to  a  special  profile  output  file 
(in  a  format  described  later)  by  a  call  to  the  procedure 
PRINT_PROFILES,  which  expects  a  file  name  string  as  input. 
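
Profile collection of this kind might look as follows. The sketch uses a Python counter keyed by (h, n, k) in place of the three-dimensional integer array, and the function names are hypothetical. Each output row carries n, k, the percentage of the samples at that n, and the raw count.

```python
from collections import Counter

profile_counts = Counter()       # (h, n, k) -> number of occurrences

def record_estimate(h, n, k):
    """Record one call: heuristic h estimated k when the true distance was n."""
    profile_counts[(h, n, k)] += 1

def profile_rows(h):
    """Rows (n, k, percent of samples at this n, count), sorted by n then k."""
    totals = Counter()
    for (hh, n, k), count in profile_counts.items():
        if hh == h:
            totals[n] += count
    return [(n, k, 100.0 * count / totals[n], count)
            for (hh, n, k), count in sorted(profile_counts.items())
            if hh == h]
```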

Profiles  are  also  useful  as  input  to  heuristic 
functions  which  attempt  to  simulate  or  model  the  behavior 
of  some  actual  heuristics.  Profile  data  such  as  that 
generated  by  previous  search  runs  in  the  manner  described 
above,  or  data  contrived  by  the  researcher,  may  be  read  in 
from  auxiliary  input  files  and  stored  in  special  record 
structures  by  use  of  the  READ_PROFILES  procedure. 
READ_PROFILES accepts a 30-character file name string as
input, and reads the profile data from this file into a
profile database which is maintained by the heuristics
module.  The  structure  of  this  is  shown  in  Figure  7.7. 
PROFILE  is  a  static  array  of  profile  pointers,  one 
corresponding  to  each  of  the  heuristics  implemented  in  the 
module.  Each  of  these  pointers  points  to  a  PROFILE_RECORD. 
Profile records contain a 10-character NAME field, an
integer HEURISTIC identification number, and six arrays.
The first three, MIN, MAX, and COUNT, are arrays of
integers,  and  record  the  minimum  and  maximum  estimates  at 
each  level  n,  as  well  as  the  total  number  of  estimates  at 
each  n.  MEAN  is  a  real  array  which  records  the  mean 
estimate  at  each  n.  STDEV  records  the  standard  deviation 
about  the  mean  at  each  n.  The  final  array,  HISTOGRAM,  is  a 
two-dimensional  integer  array  which  records  the  frequency 
of occurrence of each (n, k) pair. These fields are
calculated and filled in by READ_PROFILES as it processes
the input file data. Only profiles for a few particular
heuristics  are  used  at  any  one  time.  To  save  space, 
profile  records  are  dynamically  allocated  and  initialized 
by CREATE_PROFILE, which returns a pointer to an empty
profile  record.  After  being  filled  in,  these  profile 
records  are  entered  in  the  PROFILE  array.  The  data  in 
these  records  is  then  available  to  applications  by 
subscripting  the  PROFILE  array  with  the  heuristic 
identification  number. 
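
The derived fields of a profile record follow directly from the histogram. The sketch below shows one way to compute them; it assumes a nested dictionary for the histogram and a population (not sample) standard deviation, neither of which is confirmed by the source.

```python
import math

def summarize(histogram):
    """histogram: {n: {k: count}} -> {n: min, max, count, mean, stdev of k}."""
    summary = {}
    for n, ks in histogram.items():
        count = sum(ks.values())
        mean = sum(k * c for k, c in ks.items()) / count
        variance = sum(c * (k - mean) ** 2 for k, c in ks.items()) / count
        summary[n] = {
            "min": min(ks),              # smallest estimate seen at this n
            "max": max(ks),              # largest estimate seen at this n
            "count": count,
            "mean": mean,
            "stdev": math.sqrt(variance),
        }
    return summary
```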

... the goal; this information is used only for worst-case
simulation  and  modeling.  ESTIMATED_DISTANCE  returns  a 
natural  number  which  gives  the  selected  heuristic's 
estimate  of  the  distance  between  the  state  of  CURRENT  and 
the  GOAL.  ESTIMATED_DISTANCE  performs  two  functions.  The 
first  is  simply  to  use  the  heuristic  selector  H  to  invoke 
the  proper  function  from  a  CASE  statement.  The  second 
function  is  more  involved.  For  several  heuristic 
functions,  particularly  those  concerned  with  simulation  and 
modeling  of  heuristics,  it  is  necessary  to  know  the  true 
remaining  distance  to  the  goal,  n.  (This  is  of  course  also 
necessary  for  the  collection  of  the  profile  data.)  In 
order  to  obtain  this  value,  ESTIMATED_DISTANCE  invokes  the 
GENERATE_GRAPH procedure from the control module to
generate  a  graph  rooted  at  the  goal  state.  As  was 
described  in  the  previous  section,  GENERATE_GRAPH  returns 
pointers  to  the  graph  and  to  a  binary  search  tree 
superimposed  on  this  graph.  Each  node  in  the  graph 
contains  as  its  G_VALUE  its  minimum  distance  from  the  root 
of  the  graph,  or  in  this  case,  the  goal.  By  doing  a  simple 
search  for  the  current  state  in  the  search  tree,  the 
minimum  distance  n  can  be  quickly  obtained.  Of  course,  the 
overhead  of  generating  this  graph  is  considerable,  so  the 
graph  is  saved  and  reused  as  long  as  the  goal  state  remains 
the  same.  This  method  is  significantly  more  efficient  than 
using  an  admissible  heuristic  with  A*  to  calculate  this 
distance  every  time. 
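
The caching scheme can be sketched as follows. The sketch assumes that moves are reversible, so that distances measured outward from the goal equal distances to the goal; the module-level cache and the function names are hypothetical.

```python
from collections import deque

_cache = {"goal": None, "depth": {}}     # graph kept while the goal is unchanged

def true_distance(current, goal, successors):
    """Minimum number of moves from `current` to `goal`, by table lookup."""
    if _cache["goal"] != goal:           # regenerate only when the goal changes
        depth = {goal: 0}
        queue = deque([goal])
        while queue:
            node = queue.popleft()
            for succ in successors(node):
                if succ not in depth:
                    depth[succ] = depth[node] + 1
                    queue.append(succ)
        _cache["goal"], _cache["depth"] = goal, depth
    return _cache["depth"].get(current)  # None if `current` is unreachable
```

After the first call for a given goal, every later call is a single lookup, which is where the saving over repeated admissible A* runs comes from.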


4.  STATISTICS 


Although  not  strictly  a  part  of  the  code  directly 
associated  with  A*,  the  statistics  module  provides  routines 
which  are  useful  in  the  simulation  and  modeling  of 
heuristic  functions,  and  is  thus  included  in  the  set  of 
tools  provided.  In  addition,  the  applications  which  are 
described  in  this  document  rely  heavily  on  the  functions  in 
the  module. 

STATISTIC provides functions which generate pseudo-random
numbers, either evenly distributed within some range
of  values,  or  conforming  to  some  desired  distribution.  The 
underlying  function  used  is  MTH$RANDOM,  which  is  provided 
by  a  math  and  statistics  library  on  the  VAX,  and  which 
returns  uniformly  distributed  random  real  numbers  in  the 
range [0,1], given an integer seed. RANDOM_INTEGER_BETWEEN
uses  this  function  to  generate  uniform  random  integers 
between  the  bounds  provided. 
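
One common way to map a uniform real onto an integer range is sketched below. Python's random.random stands in for MTH$RANDOM, and the scaling shown is an assumption about how RANDOM_INTEGER_BETWEEN might work, not a transcription of it.

```python
import random

def random_integer_between(low, high, uniform=random.random):
    """Uniform random integer in low..high inclusive, from a uniform real."""
    u = uniform()
    span = high - low + 1
    # min() guards the case u == 1.0, since the stated range [0,1] is closed.
    return low + min(int(u * span), high - low)
```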

The  technique  used  to  generate  random  numbers 
conforming  to  some  non-uniform  distribution  is  a  little 
more complex, and requires some background theory.
Remember that the only random generator available generates
uniform  random  numbers  in  the  range  [0,1].  Suppose  however 
that  we  wish  to  obtain  randoms  whose  values  occur  with  a 
frequency  described  by  some  density  curve,  such  as  the 
normal  curve  shown  in  Figure  7.8(a).  There  is  another  way 
of representing this desired distribution of values, using
what is called the probability distribution function, shown
for this example in Figure 7.8(b). This function is
effectively  the  integral  of  the  density  curve,  normalized 
to  a  value  of  1,  and  represents  the  probability  that  the 
variable  in  question  will  fall  below  the  domain  value  at 
that  point.  The  other  way  to  view  this  is  as  the  summation 
of  the  area  under  the  density  curve,  or  the  running  total 
of  the  frequencies  of  the  domain  values.  Thus,  point  A, 
with coordinates (X1, 0.25), is the point at which there is
a 25% probability that the X value will be less than X1,
or alternatively, that 25% of the total number of X values
will be less than X1.

The  technique  used  to  obtain  randoms  obeying  this 
distribution  is  as  follows.  First  a  uniformly  distributed 
random  number  lying  in  the  range  of  the  distribution 
function,  [0,1],  is  obtained.  The  corresponding  domain  (X) 
value is then calculated or otherwise extracted, giving one
"hit" at that value. In the aggregate, this process is
essentially reversing the function that is the "area under
the curve", or, in other words, applying the inverse of
this distribution function. Because the density curve is the
derivative of the distribution function, this yields X values
whose frequencies conform to the desired density curve.

Another  way  of  stating  this  (intuitively)  might  be  as 
follows: 25% of the time the random number generator will
return a number between 0 and 0.25. The inverse function
will therefore return a number between -∞ and X1 (refer
to Figure 7.8) 25% of the time, as it should. If the 25%
is  replaced  by  some  arbitrary  percentage  between  0  and  100, 
it  can  be  seen  that  the  generator  is  returning  numbers  in 
the  proper  range  of  values,  exactly  the  correct  percentage 
of  the  time.  Therefore,  it  must  be  generating  numbers  with 
the  correct  frequency. 

The  most  general  way  for  the  user  to  describe 
distribution  functions  is  through  enumeration  of  their 
domain  and  range  values  at  sufficiently  many  discrete 
points.  STATISTIC  provides  this  capability  through  the  use 
of  distribution  records.  A  DISTRIBUTION_RECORD  contains  a 
name  field,  which  specifies  the  type  of  the  distribution, 
and  a  field  which  tells  how  many  pairs  are  enumerated.  The 
third  and  fourth  fields,  ABSCISSA  and  ORDINATE,  are  real 
arrays  which  specify  the  function  values  at  each  of  up  to 
100  discrete  points.  DISTRIBUTION_TYPE  is  an  enumerated 
type  which  specifies  each  of  the  possible  distributions 
available  to  importers  of  this  module.  Distributions  are 
accessed  through  the  DISTRIBUTION  array,  which  contains 
pointers  to  all  of  the  distributions  currently  available. 
These  distributions  may  be  created  by  reading  them  from  an 
auxiliary  file  using  the  READ_DISTRIBUTIONS  procedure 
provided. 

Randoms  conforming  to  a  particular  distribution  are 
obtained by the real function RANDOM_BY_DISTRIBUTION, which
accepts as input a constant of type DISTRIBUTION_TYPE
specifying the desired distribution. Creation of
distributions is discussed in the "use" section below.
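
Sampling from an enumerated distribution function can be sketched as follows. The two lists correspond to the ABSCISSA and ORDINATE arrays; the linear interpolation between enumerated points is an assumption of this sketch, since the source does not say how values between points are extracted.

```python
import bisect
import random

def random_by_distribution(abscissa, ordinate, uniform=random.random):
    """Draw a value by inverting a tabulated distribution function.

    abscissa: domain (X) values in increasing order.
    ordinate: cumulative probabilities at those points, nondecreasing to 1.
    """
    u = uniform()
    i = bisect.bisect_left(ordinate, u)      # first point with ordinate >= u
    if i == 0:
        return abscissa[0]
    if i >= len(ordinate):
        return abscissa[-1]
    x0, x1 = abscissa[i - 1], abscissa[i]
    p0, p1 = ordinate[i - 1], ordinate[i]    # p0 < u <= p1, so p1 > p0
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)
```

With the points (0, 0), (1, 0.25), and (2, 1), a uniform draw of 0.25 maps to X = 1, so a quarter of all draws fall at or below 1, as the distribution function requires.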

D.  USE  OF  THE  BEADS  WORLD  TOOLS 

The  previous  sections  have  presented  in  some  detail 
the  essential  data  structures  and  algorithms  implemented  in 
the  four  primary  tools  modules.  Before  going  on  to 
describe  our  applications,  which  use  these  tools,  it  will 
be  useful  to  describe  how  to  incorporate  these  tools  into 
applications. 

In  order  to  use  the  data  structures  and  procedures 
provided  by  a  tools  module,  it  is  necessary  to  import  this 
module.  This  involves  including  in  the  applications  module 
all  of  the  necessary  data  type,  variable,  and  procedure 
declarations,  with  appropriate  external  references  in  the 
latter  two  cases,  and  then  using  the  VAX  Linker  to  link 
these  compiled  modules  together.  Because  type  checking  is 
not  performed  across  module  boundaries,  it  is  imperative 
that  all  declarations  of  data  and  procedures  match  exactly 
in  every  module.  In  order  to  make  this  convenient,  special 
definition  files  for  each  tools  module  are  included  in  this 
package. 

The  definition  files  are  listed  in  Appendix  B.  As  an 
example, the definition file corresponding to CONTROL.PAS
is called CONTROL.DEF. It was created by deleting all code
and all local procedures from CONTROL.PAS, leaving only the
native data declarations and global procedure headings. By
including this file in an application module and then
deleting all unreferenced declarations, the user is able to
provide  all  of  the  necessary  linkage  with  the  CONTROL 
module  without  worrying  about  data  type  or  argument  list 
agreement.  Of  course,  this  mechanism  is  only  useful  as 
long  as  the  .PAS  and  .DEF  files  are  in  complete  agreement. 
This  means  that  whenever  changes  are  made  to  an  existing 
tools  module,  its  corresponding  definition  file  must  be 
updated,  as  well  as  any  other  existing  modules  which  import 
it.  This  mechanism  is  certainly  not  as  convenient  as  the 
true module capability provided by languages such as
Modula-2, but it is better than re-creating all of the definitions
each  time  a  new  application  is  written. 

The  next  important  thing  to  discuss  is  the  proper 
initialization  of  each  of  these  modules.  The  controls 
module,  heuristics  module,  and  statistics  module  each  have 
static  data  structures  which  need  to  be  created  and/or 
assigned  initial  values  before  the  routines  inside  these 
modules  are  used.  In  order  to  make  this  initialization 
convenient,  each  module  exports  an  initialization  procedure 
which  performs  this  when  called.  A  good  rule  to  follow 
when  using  these  modules  in  applications  is  to  call  the 
initialization  procedure  for  each  module  that  is  imported. 
Because  the  control  module  already  imports  from  the 
heuristics  module,  which  in  turn  imports  from  the 
statistics  module,  it  turns  out  that  applications  that  use 
these modules need only call the INITIALIZE_CONTROLS
procedure exported by the controls module. Each of the
initialization  routines  may  be  invoked  separately,  and 
multiple  invocations  cause  no  undesirable  side  effects. 

The  auxiliary  data  files  which  contain  the  profiles 
and  distributions  used  by  the  heuristics  and  statistics 
modules  are  organized  into  specific  formats.  Profiles  of 
each heuristic contain a header line which has a ten-character
name field, the heuristic number, and the number
of entries for that heuristic. Each entry has four
numbers. The first is the true distance n. The second is
the estimated distance k. The third is the percentage of
all the samples at that n for which this k was obtained,
effectively normalizing the histograms at each n. The fourth
is the total number of times that this k was obtained for
this n. Several
profiles  can  be  included  in  the  same  file.  Figure  7.9 
illustrates  the  file  format  of  profile  data. 


Figure  7.9  Example  of  the  data  in  a  profile 
auxiliary  file. 

Figure 7.10 shows the format of a distribution file.
The  first  line  of  each  distribution  is  also  a  header.  The 
first  field  is  a  constant  of  the  enumerated  type 
DISTRIBUTION_TYPE,  and  identifies  the  distribution.  The 
second  item  is  a  number  telling  how  many  discrete  point 
pairs  follow.  Each  line  after  that  consists  of  two  real 
numbers,  the  abscissa  and  ordinate,  describing  the 
distribution function at each discrete point. Several
distributions may also be included in the same file.

Figure 7.10 Distribution file format.

The two additions or changes that users writing new
applications  with  these  tools  are  most  likely  to  make  are 
the  addition  of  heuristics  and  distributions  to  those 
already  provided.  Heuristics  may  be  easily  added  to  the 
heuristics  module.  First,  the  heuristic  function  is 
created  and  inserted.  Then  an  identification  number  is 
assigned  to  this  heuristic,  and  a  corresponding  call  to  the 
function is added to the CASE statement in
ESTIMATED_DISTANCE. The system currently allows for up to
24  heuristics,  but  this  number  can  be  easily  increased. 

The  addition  of  different  distributions  is  also  fairly 
simple.  A  name  for  each  new  distribution  must  be  added  to 
the  DISTRIBUTION_TYPE  list;  these  names  are  the  key  to 
accessing  distribution  functions  from  the  heuristics 
module.  Then  the  corresponding  distribution  function  data 
must  be  created  in  the  proper  file  format,  calculated 
either  by  hand  or  by  a  throw-away  program. 

E.  APPLICATIONS  OF  THE  BEADS  WORLD  TOOLS 

Three  primary  applications  programs  were  developed  to 
generate  the  data  that  is  presented  in  the  other  sections 
of  this  document,  and  to  display  that  data  in  meaningful 
form.  Two  of  these  programs,  the  ones  responsible  for 
gathering  the  data,  incorporate  the  tools  that  have  been 
described  in  the  previous  pages.  The  third  program  is  a 
FORTRAN  graphics  package  that  displays  the  data  in  a 
variety  of  formats,  and  does  not  use  any  of  the  tools 
previously  described.  These  three  applications  are 
described  in  the  following  sections,  which  serve  as  both 
the  sole  documentation  for  these  programs  and  as  examples 
of  the  use  of  the  tools  previously  described.  Future  users 
of  this  software  package  may  be  able  to  use  these 
applications  directly,  or  may  wish  to  alter  them  or  use 
them  as  models  to  create  programs  more  tailored  to  their 
specific  needs. 


1.  GRAPH  GENERATION  AND  ANALYSIS 


The  first  application  described  in  this  section  is  the 
GRAPH_SPACE  program  module  listed  in  Appendix  C.  This 
program  provides  the  user  with  three  capabilities.  By 
varying  the  commands  and  data  in  the  input  file,  the  user 
can  generate  the  entire  graph  for  a  given  Beads  World 
configuration  and  list  its  features;  the  user  can  generate 
a  sample  set  of  start  and  goal  state  pairs  from  the  graph 
for  use  as  input  to  search  algorithms;  finally,  the  user 
can  actually  print  all  of  the  puzzle  states  in  the  graph. 

A sample input data file for GRAPH_SPACE is shown in
Figure  7.11.  Each  operation  is  represented  by  two  lines  of 
data.  The  first  line  specifies  the  configuration  of  the 
Beads  World  to  be  used,  as  well  as  the  operation  to 
perform.  The  first  two  items  in  the  line  are  the  number  of 
positions and the number of links, in that order.
Following that are the position numbers of the links. The
next  number  is  the  opcode.  There  are  three  possible 
opcodes;  0  causes  generation  of  the  graph  and  the  printing 
of  its  characteristics;  1  generates  sample  state  pairs  from 
the graph; 2 causes the graph to be printed. If the ...

Figure 7.11 Sample input data for GRAPH_SPACE. Note that
three operations are specified by the file.

The second line of input data contains only one item;
this  is  the  starting  puzzle  state  at  which  the  graph  is  to 
be  rooted. 

After  reading  in  the  two  lines  which  form  each 
command, GRAPH_SPACE invokes the GENERATE_GRAPH procedure
from  the  controls  module.  This  returns  a  pointer  to  the 
graph  rooted  at  the  starting  state,  a  pointer  to  the  binary 
search  tree  containing  the  nodes  of  the  graph,  and  a  graph 
description  record.  If  the  opcode  is  a  zero,  then  an 
output  procedure  prints  the  puzzle  configuration,  the 
starting  state,  and  the  information  contained  in  the  graph 
descriptor.  As  a  visual  aid  this  procedure  also  generates 
a  histogram  of  the  number  of  nodes  at  each  level  in  the 
graph.  An  example  of  this  output  is  shown  in  the  tables  in 
Section  3. 


[Figure residue: labels "opcode" and "sample size (optional)" above the values 1 and 5.]

If the opcode is a 1, then GRAPH_SPACE generates
sample (start, goal) puzzle state pairs from the graph.
This is done by repeatedly extracting at random a puzzle
state  from  each  level  of  the  graph.  If  a  minimum  sample 
size  is  specified,  then  at  least  that  many  states  are  taken 
from  each  level,  if  available.  In  addition,  in  order  to 
reflect the distribution of states in the graph, one
hundred additional states are extracted, in approximate
proportion to the graph space histogram mentioned above.
These states are printed out paired with the root state of
the  graph.  In  addition,  the  distance  between  these  states, 
which  is  the  level  from  which  each  goal  state  was 
extracted,  is  also  printed. 
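
The sampling scheme can be sketched as follows. The sketch assumes that the minimum sample is drawn without replacement and the proportional sample with replacement, and that each level's share is rounded to the nearest integer; these details, and all of the names, are assumptions of this illustration.

```python
import random

def sample_pairs(levels, root, min_per_level=0, extra=100, rng=random):
    """Return (distance, start, goal) triples sampled from a rooted graph.

    levels[d] lists the states at depth d; `root` is paired with each sample.
    """
    pairs = []
    for depth, states in enumerate(levels):
        # At least min_per_level states from each level, if available.
        for state in rng.sample(states, min(min_per_level, len(states))):
            pairs.append((depth, root, state))
    total = sum(len(states) for states in levels)
    for depth, states in enumerate(levels):
        # Additional states in approximate proportion to the level sizes.
        share = round(extra * len(states) / total)
        for _ in range(share):
            pairs.append((depth, root, rng.choice(states)))
    return pairs
```

The triples are ordered as distance, start, goal, which matches the problem entries that SOLVE reads as input.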

The third option (opcode = 2) causes GRAPH_SPACE to
enumerate  the  states  in  the  graph.  This  is  done 
recursively  by  neighbors  until  the  nodes  at  the  last  level 
are  reached.  An  example  of  this  is  shown  for  the  3-puzzle 
in  Figure  7.12.  At  some  60  lines  per  page,  this  procedure 
could be quite expensive for puzzles larger than the
6-puzzle.


Figure 7.12 Puzzle states for the 3-puzzle

2. PUZZLE SOLUTIONS WITH A*

The  second  application  program,  SOLVE,  is  listed  in 
Appendix  C.  This  program  provides  a  very  flexible  vehicle 
for  gathering  data  on  A*  search.  By  varying  the  input 
commands  and  data  provided  to  SOLVE,  the  user  can  run 
either  the  ordered  search  or  graph  search  versions  of  A*, 
using  any  heuristic,  at  any  weight,  for  any  of  the  possible 
Beads  World  configurations.  In  addition,  the  user  has  a 
variety  of  options  for  collecting,  synthesizing,  and 
displaying  this  data. 


Figure 7.13 Sample input data for SOLVE

The format for the input data is shown in Figure 7.13.
As with the GRAPH_SPACE program described above, the first
line of input configures the program for the proper Beads
World model. Opcodes can assume values of 0 to 3; each of
these resulting operations is described below. Finally,
the  last  item  on  the  first  line  specifies  which  search 
method  is  to  be  used:  ordered  search  or  graph  search. 

The  second  line  of  input  tells  SOLVE  which  heuristics 
and  which  weights  are  to  be  applied  to  each  problem.  The 
first  item  is  the  number  of  heuristics.  This  is  followed 
by  a  list  of  the  numeric  identifiers  of  those  heuristics. 
The  second  set  of  numbers  is  the  number  of  desired  weights, 
followed  by  a  list  of  those  weights. 

The third line contains two items, both of which are
optional. The first item is the file name of the profile
input file, enclosed in "" delimiters. If no profiles are
needed, then this field may be left blank (i.e. no
characters between the delimiters). The second item is the
profile output file name. If this is not specified then no
output files are created.

The  fourth  line  also  contains  an  optional  file  name 
field,  which  specifies  in  what  file  the  distribution 
function  description  data  is  located.  If  not  specified,  no 
attempt  is  made  to  create  distribution  records,  and 
heuristics  which  use  distributions  cannot  be  used. 

The  input  data  consists  of  an  unspecified  number  of 
puzzle  problem  entries.  Each  entry  consists  of  three 
values.  The  first  is  the  minimum  distance  between  the 
start  and  goal  states  in  the  graph.  The  second  and  third 
items  are  the  state  descriptions  of  the  start  and  goal, 
respectively. Note that these entries are in the same
format as those generated by GRAPH_SPACE when instructed to
generate sample pairs.

As mentioned before, the opcode can specify one of
four distinct actions. The first (opcode = 0) results in
the printing of a complete description of the puzzle
problem and of the performance of the search algorithm in
finding a solution. The second option (opcode = 1) is the
same as the first, with the additional feature of printing
the puzzle states lying on the solution path. The third
option (opcode = 2) is used whenever the data set is too
large to be displayed using one of the first two options,
or when it is more informative to see the data condensed
for comparison. The raw data is simply printed for each
puzzle instance (start and goal pair at every heuristic and
weight) according to a specific format. Figures 7.14(a)
through 7.14(c) give examples of each of these options.


The fourth option (opcode = 3) is significantly different from the
first  three.  In  order  to  provide  an  aggregate  measure  of 
the  performance  of  the  algorithm  on  the  input  data  across 
the  heuristics  and  weights,  the  results  of  each  run  must  be 
aggregated  for  each  level  n.  The  format  of  this 
performance  data  is  shown  in  Figure  7.14(d). 

There  are  two  basic  measures  of  performance.  The 
first  is  the  number  of  nodes  expanded.  For  every  n  and 
every  heuristic  at  every  weight,  there  are  three  entries: 
the  minimum  number  of  nodes  expanded,  the  maximum  number  of 
nodes  expanded,  and  the  mean  number  of  nodes  expanded.  The 
second  performance  measure  is  path  length,  and  again  for 
every  n,  heuristic,  and  weight,  there  are  three  entries: 
minimum  path  length,  maximum  path  length,  and  mean  path 
length.  This  data  is  organized  into  two  groups  of  three 
lines  each,  as  shown  in  Figure  7.14(d).  This  data  is  also 
in  the  proper  format  for  processing  by  the  graphical 
display  package  which  is  described  in  the  next  section. 
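
The aggregation just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the record layout (n, heuristic, weight, nodes expanded, path length) and the function name aggregate are hypothetical and do not reproduce the output format of Figure 7.14(d).

```python
from collections import defaultdict


def aggregate(runs):
    """Group runs by (n, heuristic, weight) and report min, max, and mean.

    runs: iterable of (n, heuristic, weight, nodes_expanded, path_length).
    Returns a dict keyed by (n, heuristic, weight).
    """
    groups = defaultdict(list)
    for n, heuristic, weight, nodes, length in runs:
        groups[(n, heuristic, weight)].append((nodes, length))

    summary = {}
    for key, rows in groups.items():
        nodes = [r[0] for r in rows]
        lengths = [r[1] for r in rows]
        summary[key] = {
            # First performance measure: nodes expanded.
            "nodes": (min(nodes), max(nodes), sum(nodes) / len(nodes)),
            # Second performance measure: solution path length.
            "length": (min(lengths), max(lengths), sum(lengths) / len(lengths)),
        }
    return summary
```

Each key then carries the two groups of three figures (minimum, maximum, mean) discussed above.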


Figure 7.14(a) Printed results using option '0' with SOLVE.

Figure 7.14(c) Printed results for option '2' with SOLVE. Note the tabular form.

Figure 7.14(d) Printed aggregate results for option '3' with SOLVE.
3.  GRAPHIC  DISPLAY  OF  RESULTS

The  SOLVE  application  described  above  generates  large 
volumes  of  data  which,  if  left  in  tabular  form,  can  be 
rather  difficult  to  analyze  and  evaluate.  Transforming 
such  tables  into  graphical  form  makes  it  easier  to  detect 
trends  and  to  observe  interesting  behavior  in  the  data.  A 
good  picture  is  worth  a  thousand  words,  and  in  this 
setting,  graphs  serve  to  provide  a  descriptive  and  precise 
summary  of  the  execution  results. 

A  powerful  commercial  graphics  package  called  DISSPLA 
is  available  on  the  RICC  VAX-11/780;  it  was  found  to  be  a 
versatile,  well-documented,  and  fairly  easy-to-use  package 
that  not  only  offered  all  of  the  graphing  formats  required, 
but  also  permitted  their  review  on  either  the  terminal  or 
in  hard  copy.  However,  there  is  such  a  variety  of  methods 
and  combinations  in  which  to  view  the  volumes  of  data  that 
it  became  necessary  to  create  a  tool  to  gather  the  specific 
graph  parameters  from  the  user,  extract  the  required  data 
from  the  data  base,  and  make  the  necessary  calls  to  DISSPLA 
to  finally  provide  the  graph.  This  tool  takes  the  form  of 
a  basic  program  framework,  rather  than  a  general,  all- 
encompassing  package,  because  it  is  tied  so  closely  to  the 
data,  and  because  of  the  varied  nature  of  the  graphs 
required.  This  tool  provides  a  basic,  moldable  framework 
that  can  be  easily  tailored  to  specific  needs. 

The basic framework is written in VAX FORTRAN, and
consists of code performing four distinct tasks: (1)
input of data, (2) menu operation, (3) axis and graph
set-up, and (4) curve plotting. The input routine reads
the  data  from  the  appropriate  aggregate  data  output  file 
generated  by  SOLVE.  The  menu  presents  the  user  with  a 
variety  of  viewing  options  and  plotting  combinations  from 
which  to  select.  The  axis  set-up  routines  establish  the 
appropriate  type  of  graph  with  DISSPLA,  based  on  the 
parameters  selected  by  the  user  in  the  menu  routine. 
Finally,  the  curve  plotting  routine  extracts  the 
appropriate  data  from  the  data  base  for  each  curve  selected 
by  the  user,  and  calls  DISSPLA  routines  to  plot  and  label 
each. 

Appendix  C  includes  the  code  from  two  of  the  main 
graphing  tools  created  from  this  basic  framework:  GRAFER 
and  GRAFPROF.  GRAFER  generates  four  types  of  complexity 
graphs: (1) X versus N (see Figure 6.1(b)), (2) X versus W
(see Figure 6.9), (3) L versus N (see Figure 6.6), and (4)
L versus W (see Figure 6.15). Input to GRAFER must be in
the format shown in Figure 7.14(d), residing in a file
named SEARCH.OUT.

The other routine, GRAFPROF, provides profile graphs in
one of two forms: (1) two-dimensional K versus I (see
Figure 8.2(a)), and (2) three-dimensional K versus I versus
Frequency (see Figure 8.2(b)). The input to GRAFPROF must
be in the format depicted in Figure 7.9, residing in a file
called PROFILE.RUN. (Source profiles are optionally read
from a file called PROFILE.PRO.)

As  indicated,  these  graphing  routines  stem  from  the 
same  general  framework.  Other  minor  modifications  resulted 
in  variations  that  created  the  graphs  shown  in  Figures  5.6 
through  5.17  (directly  comparing  Graph  search  and  Ordered 
search),  and  Figures  5.18  through  5.26  (comparing  F  and  G 
as discriminators in A*). These are mentioned as testimony
to  the  versatility  of  the  framework,  should  future  research 
require  modification  of  these  applications. 

F.  ADDITIONS  AND  ENHANCEMENTS 

As  with  any  large  and  complicated  piece  of  software, 
this  package  has  undergone  almost  constant  evolution,  as 
use  has  brought  out  shortcomings  and  possible  improvements 
in  its  features.  In  fact,  the  authors  do  not  view  this 
software  as  complete,  but  rather  as  evolved  to  a  stable 
enough  point  to  be  useful  to  other  researchers.  In  this 
spirit,  several  possible  additions  and  enhancements  are 
suggested,  to  make  this  package  an  even  better  tool.  These 
suggestions  fall  into  two  categories,  discussed  below. 

Most  of  the  suggestions  for  improvement  fall  under  the 
heading  of  efficiency.  It  was  felt  to  be  very  important  to 
maintain  the  flexibility,  maintainability,  and 
understandability  of  this  package.  To  a  large  extent, 
careful packaging and the use of modules have accomplished
this. However, this has also meant that some sacrifices
have been made in efficiency. The major limitation on the
use of this package is the tremendous amount of CPU time
that  is  required  to  obtain  a  statistically  significant 
amount  of  data.  One  of  the  largest  time  costs  associated 
with  this  implementation  is  the  large  number  of  system 
calls  performed  while  doing  dynamic  allocation.  A  major 
savings  could  be  obtained  by  replacing  the  dynamically 
allocated  neighbor  lists  with  static  neighbor  arrays  in 
each  puzzle  node.  Ultimately,  this  does  not  greatly 
increase  the  required  memory,  and  does  significantly  reduce 
the  number  of  system  calls.  Even  more  drastically,  if  the 
user  is  willing  to  restrict  the  maximum  beads  world  puzzle 
size  to,  for  example,  the  "6-puzzle",  then  static 
allocation  is  feasible  for  all  of  the  puzzle  nodes  as  well. 
Dramatic  time  savings  would  then  be  possible. 

Other  areas  of  efficiency  include  reducing  the  number 
of  procedure  calls  and  optimizing  certain  algorithms  and 
sections  of  code.  This  usually  involves  obscuring  the 
functionality  of  the  code,  but  may  be  worthwhile  if 
sufficient  CPU  time  savings  are  realized. 

The second area of improvement is that of "user-friendly"
operation. The primary motivation for using
input  data  in  the  standard  input  file  as  both  command  and 
data  was  to  allow  the  use  of  the  batch  queue  to  submit 
large  data  runs  without  tying  up  a  terminal  or  account  for 
several  hours  or  days.  However,  it  would  be  nice  if  some 
form  of  interactive  or  menu-driven  command  mode  were 
VIII.  SIMULATING  HEURISTIC  BEHAVIOR

A.  INTRODUCTION 

This  chapter  deals  with  a  means  of  comparing  the 
results  of  classes  of  heuristics  whose  statistical 
estimating  behavior  is  stochastically  the  same.  There  are 
two  related  goals:  (1)  to  empirically  verify  claims  in  the 
literature  that  heuristics  sharing  the  same  KMIN  and  KMAX 
estimate-bounding  functions  (called  profiles)  are 
"equivalent",  and  (2)  to  study  how  completely  these 
statistical  profiles  capture  the  essence  and  power  of  the 
heuristic which they represent. This chapter first
describes what a profile consists of and how the profiles
were constructed and used. It then discusses the above
objectives in detail, summarizes the simulated heuristics
used, and concludes by examining the results obtained by
simulation with these profiles.

1.  WHAT  IS  A  PROFILE? 

In general, at any given time, there are many nodes in
the search graph at distance i from the goal. The
estimates calculated by a heuristic function (K) from the
set of nodes at level i to the goal will span a range of
values, with lower and upper limits which shall be called
KMIN(i) and KMAX(i), respectively. This set of numeric data
collected over all i characterizes a heuristic in terms of
the bounds on its error behavior. In addition to KMIN(i)
and KMAX(i), other data that could be collected to further
characterize a heuristic includes the mean of its estimates
at each i (called KMEAN(i)), the standard deviation at each
i (called ST_DEV(i)), and the frequency of each of the
estimates for each i (called the actual distribution). This
set of statistical data collected on a heuristic is
referred to as a "profile".
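
The definition above can be made concrete with a short Python sketch that builds a profile from observed (i, K) pairs. The function name build_profile, the dictionary layout, and the use of the population form of the standard deviation are our own assumptions, not details taken from the thesis software.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def build_profile(samples):
    """Build a profile from (i, k) pairs.

    i is the true distance to the goal, k the heuristic's estimate.
    Returns a dict mapping each i to its statistics.
    """
    by_level = defaultdict(list)
    for i, k in samples:
        by_level[i].append(k)

    profile = {}
    for i, ks in by_level.items():
        profile[i] = {
            "KMIN": min(ks),
            "KMAX": max(ks),
            "KMEAN": mean(ks),
            "ST_DEV": pstdev(ks),  # population form; an assumption
            "DIST": Counter(ks),   # the actual distribution of estimates
        }
    return profile
```

Heuristics that are equivalent in Gaschnig's sense would agree on the KMIN and KMAX entries at every i, while a shared full profile would also require the remaining entries to match.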

2.  EQUIVALENCE  OF  HEURISTICS 

Gaschnig (1979, p. 84) claimed:

"Two  K  functions  are  equivalent  iff  their 
corresponding  KMIN  and  KMAX  functions  are 
identical.  We  have  blurred  the  distinction 
between  all  K  functions  that  happen  to  have  a 
particular  KMIN  and  KMAX  as  bounding  functions". 

Therefore, if two heuristics are different in their
manner of estimating the distance to the goal, but always
arrive at some distribution of values within the bounds
KMIN and KMAX, according to Gaschnig, they are termed
'equivalent'. Equivalent heuristics should produce similar
results in terms of the number of nodes expanded, solution
path length, and behavior at various weights. However,
Gaschnig's 'definition' of equivalence appears to be
somewhat simplistic, because even though two heuristics may
share the same bounding functions, at any given node they
can calculate entirely different results. (This is
referred to as "timing".)

We indicated earlier that heuristics can be
characterized by profiles that consist of the statistics
KMIN, KMEAN, KMAX, ST_DEV, and the frequency distribution
of values. If two heuristics were to share an entire
profile, and not just KMIN and KMAX, they could be declared
equivalent with greater confidence because the basis of the
equivalence stems from a more specific description of their
respective behavior. Although these additional measures
do not guarantee that the two will behave identically at
every node, at least they do ensure that the two will have
the same aggregate statistical behavior, which is not the
case using only KMIN and KMAX.

One objective, then, is to empirically study the results
of several different heuristics whose 'equivalence' is
based on various aspects of their respective profiles (i.e.
heuristics bounded by the same KMIN and KMAX, or heuristics
sharing the same KMEAN, or even heuristics sharing an
entire profile).

3.  COMPLETENESS  OF  PROFILE 

Another way of looking at equivalence is to say that a
full profile characterizes the heuristic completely enough
that the profile could be used in place of the actual
heuristic (in other words, the profile could be used to
simulate the original heuristic), and the resulting
performance in terms of number of nodes expanded should be
identical. An additional objective, then, is to see how
completely the profile characterizes the actual heuristic
on which it was based.

4.  WHAT  IS  SIMULATION? 

Typically, heuristics take the form of an equation or
formula that evaluates particular aspects of a given node's
state and calculates an estimate based on what it finds.
For example, the heuristic K1 described in Chapter VI
counted the number of tiles out of place as its estimate.
In simulation, the basic technique is to eliminate the need
for a formula or equation that is dependent upon a given
node's configuration, introducing a level of abstraction
between the heuristic and the domain by using statistics
stemming from observed behavior elsewhere in the real
world. The simulation is accomplished through use of
profiles and "contrived" heuristics, where the contrived
heuristic is a black-box that bases its estimate on the
information given in another heuristic's profile rather
than by calculating it from bead configurations. In
essence, the contrived heuristic is a "copy-cat" heuristic
that "says what the other guy said" and has no information
of its own to offer.

Our contrived heuristics use the profiles simply as a
'look-up' device: to evaluate a given node, instead of
looking at its bead configuration (which a real heuristic
would have to do), the contrived heuristic looks up what
another heuristic calculated for all nodes at distance i
from the goal, and returns some value based on the
statistics found in that profile. Since the profile is
made up of several categories of figures, this makes
available a variety of numbers to choose from and return.
Some of the many values that could be returned include:

(1) KMIN(i)

(2) KMAX(i)

(3) KMEAN(i)

(4) KMEAN(i) +/- N standard deviations

(5) a random value between KMIN(i) and KMAX(i)

(6) a random value normally distributed around KMEAN(i)

(7) a random value selected according to the actual
    distribution of estimates in the profile

(8) KMEAN(i) + N standard deviations if on path,
    KMEAN(i) - N standard deviations if off path

etc.
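
The look-up idea behind several of the options listed above can be sketched as follows. This is a minimal illustration and not the thesis implementation: the function name contrived_estimate, the dictionary keys, and the parameters n_sd and on_path are hypothetical, and options (6) and (7) are left out of the sketch.

```python
import random


def contrived_estimate(entry, option, n_sd=1.0, on_path=False, rng=random):
    """Return a 'plagiarized' estimate from one profile entry.

    entry: dict with KMIN, KMAX, KMEAN, ST_DEV for a single distance i.
    option: the list number of the value to return (1-5 or 8).
    """
    if option == 1:
        return entry["KMIN"]
    if option == 2:
        return entry["KMAX"]
    if option == 3:
        return entry["KMEAN"]
    if option == 4:
        # n_sd may be negative to move below the mean.
        return entry["KMEAN"] + n_sd * entry["ST_DEV"]
    if option == 5:
        return rng.uniform(entry["KMIN"], entry["KMAX"])
    if option == 8:
        # Raise the estimate on the path, lower it off the path.
        sign = 1.0 if on_path else -1.0
        return entry["KMEAN"] + sign * n_sd * entry["ST_DEV"]
    raise ValueError("option not covered by this sketch")
```

Note that the node's own configuration never appears as an argument: only the distance i (through the chosen profile entry) and, for option (8), whether the node lies on the solution path.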


So essentially, a profile supplies the statistical
performance information gathered on an actual heuristic,
and the contrived heuristic decides which categories from
that profile to use in forming its plagiarized estimate.
This choice can have a dramatic effect on the performance
results: consider the difference in search performance of a
contrived heuristic returning KMEAN as its estimate, and
another using KMAX. (One would expect better-than-average
performance using the former, and worse-than-average
results using the latter.) However, since the typical
real-world heuristic does not merely return one single
value, to be more realistic, the contrived heuristic could
emulate this behavior by varying its estimates using any of
the techniques offered in options 5, 6, and 7 above.

The  technique  that  the  contrived  heuristic  uses  to 
derive  its  plagiarized  estimate  from  the  profile  ultimately 
defines a particular pattern or distribution of values.
For example, if only KMAX is returned, the distribution of
values  returned  over  the  course  of  the  run  will  be  the 
value  KMAX,  with  no  deviation.  Using  an  option  such  as 
expressed  in  items  5,  6,  or  7  above  results  in  a  wider 
distribution  that  spans  several  values  at  each  i.  The 
selection  of  which  distribution  to  use  depends  upon  the 
objective  of  the  simulation:  one  could  choose  a 
distribution  to  focus  on  a  particular  type  of  behavior 
(like  options  1,  2,  3,  4,  and  8  above),  or  the  objective 
might  be  to  attempt  to  duplicate  the  performance  of  the 
original  heuristic.  The  options  and  distributions  that  we 
implemented  for  our  contrived  heuristics  are  discussed  in 
detail  following  a  detailed  description  of  the  process  used 
to  build  profiles. 

B.  GENERATING  PROFILES 

The simulation process depends upon having a good set
of profiles derived from actual heuristics, which had to be
gathered before any experimental work could proceed. This
process involves collecting two figures on a given pair of
puzzle configurations: one figure is the true minimum
distance (i) between the two states, and the other is the
heuristic's estimate of that distance. The collection of
these two values over a large number of state pairs (we
call them start/goal pairs) characterizes the heuristic
being used so that at any particular distance i, this
profile could divulge the absolute minimum and maximum
values observed, and also the mean and standard deviation
of the aggregate sample.

Gaschnig (1979, pp. 39-42) used two methods to obtain
figures for his profiles. For the first method, he used
the 895 start/goal states of his solution sample as one set
on which to gather figures. He knew the actual distances,
which were determined as the pairs were created, and only
had to let his heuristics estimate these distances to
provide the necessary figures to build the profile.
However, the results from this method were rejected because
the sample consisted of only 895 values, which was felt to
be too small to be meaningful.

The second method employed by Gaschnig was to use the
nodes generated in the search tree during the solution of
the 895 problems; by comparing each node generated with
the root node, a greater number of pairs could be sampled.
Since the actual distance of each node from the root was
known (simply the value G since he used A* with an
admissible F), all he needed to do was to let his
heuristics provide their estimate of this distance, and
collect the resulting values. This method provided 11,448
"start/goal" state comparisons.

Neither method appealed to us. The first method is
based on too small a sample; the second method can bias the
results, because it only samples nodes on or close to the
solution path. Thus it might not represent the heuristic's
full range of estimates.

An  alternative  method  was  devised  instead.  From  a 
given  start  state,  the  complete  state  space  was  built  by 
using  the  same  basic  process  that  generated  the  198 
start/goal pairs (described in Chapter IV), but modified
specifically  to  gather  the  profile  information  that  we 
needed.  The  random  selection  mechanism  from  that  process 
was  then  employed  to  select  a  proportional  number  of  nodes 
from  each  level  of  the  state  space.  For  each  node 
selected,  the  heuristic  estimated  its  distance  from  the 
start,  or  root  node.  The  actual  distance  was  the  level  of 
the  node,  or  the  value  G,  since  the  state  space  was  built 
using  A*  with  an  admissible  F. 
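
The selection step of this method, drawing from each level a number of nodes proportional to that level's size, might look like the Python sketch below. It is an illustration only: the names sample_proportionally, levels, and fraction are hypothetical, and the rule of taking at least one node from every non-empty level is our own assumption.

```python
import random


def sample_proportionally(levels, fraction, rng=random):
    """Pick nodes at random from each level, in proportion to its size.

    levels: dict mapping level (distance from the root) to a list of nodes.
    fraction: share of each level to sample, e.g. 0.1 for ten percent.
    """
    chosen = {}
    for level, nodes in levels.items():
        # At least one node per non-empty level (an assumption).
        count = min(len(nodes), max(1, round(fraction * len(nodes))))
        # rng.sample draws without replacement, so no node repeats.
        chosen[level] = rng.sample(nodes, count)
    return chosen
```

Each selected node would then be paired with the root: its level supplies the actual distance, and the heuristic supplies the estimate.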

This method was attractive because it randomly selected
nodes at each level in proportion to the total number of
nodes at each level of the state space. Also, the nodes
selected were not on any particular path (unlike Gaschnig's
second technique which tended to sample nodes along the
route to the goal). So, not only were duplicates minimized,
but in addition any node in the tree had an equal chance of
being picked. Taken over a wide number of search trees,
this method thoroughly tested the heuristics, and should
provide profiles that represent the true range of values in
a general setting.


1.  COMPARING  GENERATION  METHODS

For comparison, we created profiles using both
Gaschnig's second technique (referred to as Method A) and
the alternative method described above (called Method B).
Method A gathered heuristic estimates for K1, K2, and K3
during the solution of 198 start/goal pairs at seven
weights, and the Method B profiles were created from 100
arbitrarily selected start configurations (and hence, 100
different search trees).

Table  8.1  compares  the  number  of  samples  taken  from  the 
various  levels  using  the  two  methods. 
TABLE 8.1

Profile Sample Sizes

Level            Method A     Method B

  6                 10004          578
  7                 16233          620
  8                 25064          683
  9                 49972          824
 10                 73930         1027
 11                121274         1235
 12                154682         1557
 13                190686         1810
 14                175902         1972
 15                141106         1818
 16                 90395         1574
 17                 46899         1128
 18                 19614          763
 19                  8955          579
 20                  2938          396

Sample Size:    1,151,457       18,756

Table 8.1 shows that many more values were gathered
using Method A (over 1 million compared to 18 thousand for
Method B). However, Table 8.2 (below) compares the results
from both profiling methods, and shows that not only is the
span of values narrower using Method A (at level 8, Method
A only had estimates 4, 5, and 6, while Method B included
3, 4, 5, and 6), but also that the distribution is
different from Method B's values (at level 4, Method A
returned the value 4 40% of the time, while Method B
returned a 4 69% of the time), in spite of the fact that
Method B had a much smaller number of values in its sample.
Only two levels are shown from the values collected for
heuristic K1, but the results at all other levels and
heuristics reflect this same trend.


TABLE 8.2

Disparity of Distributions

Actual          Heuristic
Distance (i)    Estimate (K)    Method A    Method B

     4               3             60%         31%
     4               4             40%         69%
     8               3              0%          5%
     8               4             32%         13%
     8               5             54%         57%
     8               6             14%         25%

Figure 8.1 provides a graphical representation
comparing the values gathered using Method B and Gaschnig's
profile for K3, and brings up an interesting point. In
Chapter VI, we were disturbed that the profile from our
version of K3 was not the same as Gaschnig's because our
graphs differed slightly (see Figure 6.21). However, the
new profile appears to be in much greater agreement (Figure
8.1b). Note that our K3 (KMIN) underestimates when the
distance from the goal (i) is high. However, observe that
Gaschnig's K3 overestimated, again raising the question of
whether our K3 was interpreted correctly. In spite of this
apparent discrepancy, we still claim our version of K3 is
correct, and for support, provide an 8-Puzzle example where
the heuristic underestimates the true distance from the
goal:


Start                    Goal

[8-Puzzle start and goal board diagrams]

K3 = K2 + 3 * SEQ


The actual distance to the goal in this example is 28
moves, or simply the rotation of each of the seven
perimeter tiles four positions clockwise. Since the
perimeter tiles are in their relative positions with
respect to their neighboring tiles, SEQ is 0. The
Manhattan Distance calculates the shortest route to each
tile's goal position, which is four moves for tiles 2, 4,
6, and 8. Tiles 1, 3, and 5 need only two moves each,
using the center of the puzzle as a shortcut. Hence, the
Manhattan Distance becomes 22 (4 x 4 + 3 x 2), and since
SEQ is 0, K3 returns the value 22, which is below the
actual distance of 28. This value is contrary to the
behavior depicted at this distance for Gaschnig's K3, but
is consistent with the results observed at large i from our
K3 using Method B (Figure 8.1).


AD-R172  *96 


UNCLASSIFIED 


AN EMPIRICAL STUDY IN THE SIMULATION OF HEURISTIC ERROR
BEHAVIOR (U) AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB
OH S R HANSEN 1986 AFIT/CI/NR-86-184T


Figure 8.1 Profiles for K3 (estimated distance K versus N,
the distance to goal; curves KMAX(N), KMEAN(N), and KMIN(N))

Figure 8.1b 6-Puzzle Profile using Method B


2.  PROFILING  CONCLUSIONS 

There  are  6  million  possible  combinations  of 
start/goal  pairs  for  the  6-Puzzle,  and  even  though  Method  A 
sampled  one  million  pairs,  there  was  a  tremendous  amount  of 
duplication  in  the  samples  selected  since  they  were  all 
picked  from  paths  en  route  to  a  goal.  Method  B  was  based 
on  fewer  comparisons,  picking  18,756  out  of  the  6  million 
for  a  selection  ratio  of  1  in  300,  but  was  derived  at 
random  from  100  different  search  trees,  giving  the  sample 
the  potential  for  more  breadth  and  hence,  more  opportunity 
to  find  the  underestimating  cases  we  observed  in  Figure 
8.1.  Also,  Method  B  was  not  as  prone  to  sample 
duplications. 

Gaschnig (1979, p. 40) stated that "clearly the values
obtained by this [profile] depend on the number of samples
on which the [profile] was based...". (Gaschnig's sample
was based on 11,488 out of 60 billion possible
combinations, for a sample selection ratio of 1 in six
million, but with the same possibility for duplications as
our Method A.) The important issue is not only choosing a
statistically significant number of values to sample (which
Method A did), but also sampling randomly over a wide
variety from the total available (which neither Method A
nor Gaschnig did). Method B appears to succeed on both
points, providing a wider sample at each level tailored to
the shape of the state space, and appearing to capture the
average statistical behavior of the heuristic.


The profiles gathered using Method B are referred to as
"Source Profiles" since their data was chosen as the basis
for the simulation experiments described later. (The
Source Profiles for K1, K2, and K3 are presented in the
next section.) This name is used to distinguish them from
the "Run Profiles" (which are useful for reasons described
later) that depict the behavior of a heuristic in
application, and tend to be narrower in scope, and
therefore not useful as a statistical base for our work.

3.  DESCRIPTION  OF  SOURCE  PROFILE  GRAPHS 

Figures  8.2  through  8.4  present  a  graphical 
representation  of  the  source  profiles  using  two  distinct 
forms.  The  first  graph  provided  is  a  two-dimensional  view 
of  the  profile  showing  KMIN,  KMEAN,  KMAX,  and  includes 
standard  deviation  information.  A  line  called  "optimal" 
was  included  for  reference,  which  corresponds  to  the  entity 
K*,  or  the  value  that  a  perfectly  informed  heuristic  would 
return. The second pair of graphs shows a
three-dimensional view of the same heuristic's profile. The
height of the peaks corresponds to the frequency with which
each value was encountered, and illustrates the actual
distribution of the values that the heuristic calculated.
Two views of the same graph are given, one from the front
and one from the side of the graph in order to improve the
reader's perspective.
of the aggregate values produced during the entire run
forms a "normal" (bell-shaped) curve around KMEAN. This
distribution  depends  on  the  standard  deviation  of  the 
values  in  the  Source  Profile,  and  distributes  the  values  so 
that  68%  fall  within  one  standard  deviation  of  KMEAN,  93% 
fall  within  two  standard  deviations,  97.8%  fall  within 
three  standard  deviations,  and  99.6%  fall  within  four 
standard  deviations  of  KMEAN. 

Note  that  the  emphasis  is  to  focus  on  KMEAN  rather  than 
KMIN  and  KMAX,  and  that  some  of  the  values  returned  may 
actually  be  outside  the  limits  KMIN  and  KMAX  in  the  Source 
Profile.  However,  this  contrived  heuristic  should  give  a 
good  statistical  reproduction  of  the  original  heuristic's 
profile,  and  therefore,  we  expect  that  it  should  also 
behave  identically  in  terms  of  number  of  nodes  expanded  and 
solution  path  length  found. 
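The normal-distribution heuristic described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the profile table, the function names, and the use of Python's Gaussian generator are all assumptions. An exact Gaussian places roughly 68.3%, 95.4%, and 99.7% of values within one, two, and three standard deviations, so the percentages quoted above suggest the original generator was an approximation.

```python
import random

# Hypothetical Source Profile: distance-to-goal -> (KMEAN, ST_DEV).
# The numbers are invented for illustration, not taken from the thesis.
SOURCE_PROFILE = {5: (4.0, 1.0), 10: (6.5, 1.5), 15: (8.0, 2.0)}


def normal_contrived_h(distance, rng):
    """Draw a heuristic value from a bell curve centred on KMEAN."""
    kmean, st_dev = SOURCE_PROFILE[distance]
    # Values are not clipped, so some may fall outside KMIN..KMAX.
    return rng.gauss(kmean, st_dev)


def fraction_within(distance, k, samples=20000, seed=1):
    """Fraction of sampled values within k standard deviations of KMEAN."""
    rng = random.Random(seed)
    kmean, st_dev = SOURCE_PROFILE[distance]
    hits = sum(
        abs(normal_contrived_h(distance, rng) - kmean) <= k * st_dev
        for _ in range(samples)
    )
    return hits / samples
```

Calling `fraction_within(10, 1)` estimates the share of values within one standard deviation of KMEAN at distance 10, which can be compared against the 68% figure quoted in the text.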

2.  ACTUAL  DISTRIBUTION 

The  contrived  heuristic  based  on  this  distribution 
randomly  selects  values  according  to  the  frequency 
distribution  of  the  values  calculated  by  the  actual 
heuristic.  This  means  that  if  the  actual  heuristic 
calculated  the  value  10  at  distance  15  from  the  goal  23%  of 
the  time,  then  the  contrived  heuristic  should  return  that 
same  value  with  the  same  frequency  over  the  course  of  the 
run when the goal is 15 moves away. Values from this
distribution remain within the bounds of KMIN and KMAX, and
KMEAN remains virtually the same also.

This contrived heuristic should reproduce the profile
from the original heuristic very closely (at least as
precisely as is possible), and we expect it to duplicate
the performance of the actual heuristic.
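Sampling from the actual frequency distribution can be sketched as a weighted random choice over the observed values. The histogram below is hypothetical (only the 23% share for the value 10 at distance 15 echoes the example in the text), and the function name is an assumption.

```python
import random

# Hypothetical histogram for distance 15: heuristic value -> observed count.
# Out of 100 observations, the value 10 occurs 23 times (23%).
HISTOGRAM_AT_15 = {8: 31, 9: 46, 10: 23}


def actual_contrived_h(histogram, rng):
    """Return one value, chosen with probability proportional to its count."""
    values = list(histogram)
    counts = [histogram[v] for v in values]
    return rng.choices(values, weights=counts, k=1)[0]


rng = random.Random(7)
draws = [actual_contrived_h(HISTOGRAM_AT_15, rng) for _ in range(20000)]
share_of_10 = draws.count(10) / len(draws)
```

Because values are drawn only from those actually observed, every draw stays within KMIN and KMAX for that distance, which matches the property claimed above.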

3.  WORST-CASE  DISTRIBUTION 

This  contrived  heuristic  provides  a  different 
distribution  than  those  described  above,  and  emulates  a 
worst-case  behavior  model  empirically.  This  distribution 
deliberately  attempts  to  divert  the  search  process  from  the 
ideal  path  to  the  goal  by  over-estimating  the  distance  of 
the  correct  nodes,  and  undercutting  (or  under-estimating) 
the  values  of  nodes  off  the  solution  path.  Since  the 
search  algorithm  selects  nodes  with  the  lowest  F-value, 
this  heuristic  encourages  the  search  to  wander  away  from 
the  best  solution  paths.  Our  implementation  of  this 
contrived  heuristic  returns  KMEAN  plus  one  standard 
deviation  if  the  node  being  evaluated  is  on  the  ideal 
solution  path,  and  returns  KMEAN  minus  one  standard 
deviation  otherwise. 

This  type  of  distribution  should  not  deviate  greatly 
from  the  limits  KMIN  and  KMAX  (depending  on  the  size  of 
ST_DEV)  but  will  definitely  alter  KMEAN.  This  will  provide 
empirical  verification  of  whether  Gaschnig's  definition  of 
equivalence  (based  only  on  KMIN  and  KMAX)  is  sufficient,  or 
whether  a  full  profile  is  necessary. 
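The worst-case rule is simple enough to state directly in code. The sketch below follows the description above (KMEAN plus one standard deviation on the ideal path, KMEAN minus one standard deviation elsewhere); the function name and the sample numbers are assumptions.

```python
def worst_case_h(kmean, st_dev, on_solution_path):
    """Penalise nodes on the ideal path and favour nodes off it."""
    if on_solution_path:
        return kmean + st_dev  # over-estimate the correct nodes
    return kmean - st_dev      # undercut the nodes off the solution path


# Two sibling nodes at the same true distance (hypothetical profile values).
KMEAN, ST_DEV = 6.0, 1.5
h_on_path = worst_case_h(KMEAN, ST_DEV, True)
h_off_path = worst_case_h(KMEAN, ST_DEV, False)
```

With equal G, the off-path sibling always receives the lower F-value, so a search that expands the lowest F-value first is steered away from the ideal path.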


4.  MECHANICAL  ISSUES 

Notice  that  the  simulated  heuristics  need  to  know  the 
actual  distance  to  the  goal  in  order  to  look  up  the 
appropriate  figures  in  the  profile,  which  would  be  cheating 
in  the  real  world  because  a  real  heuristic  wouldn't  know 
this  information.  In  fact,  this  is  precisely  what  the 
heuristic  is  supposed  to  be  telling  us. 

Since  the  simulation  requires  that  the  contrived 
heuristic  be  provided  with  the  actual  distance,  a  brief 
discussion  of  our  method  of  supplying  it  with  this 
information  is  in  order.  There  were  two  methods  available 
to  calculate  the  actual  minimal  distance  between  each 
node  during  the  search  and  the  goal.  One  method  was  to 
invoke  the  A*  algorithm  using  an  admissible  F  to  find  the 
distance,  but  this  was  unappealing  and  was  rejected  because 
every  node  generated  would  require  a  separate  execution  of 
A*,  which  was  being  used  already  to  solve  the  real  problem! 
Another method entailed generating the state space with the
goal as the root using the A* algorithm with an admissible
F. Nodes generated during the solution of the start/goal
pair correspond to nodes located in this "inverted" state
space. Finding the distance to the goal involved searching
the inverted state space for that node and using its G as
the value i. (Additional pointers were added to the node
structure in order to provide efficient searching of the
inverted state space.) While generating the state space is
expensive (2520 nodes), this overhead could be virtually
eliminated using a carefully chosen sample in which all of
the start/goal pairs use a common goal state. In this
situation, the cost of building the inverted state space is
paid only once for all 198 problems solved, saving
thousands of A* executions.
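The idea of an inverted state space can be sketched with a plain breadth-first search from the goal. This swaps in breadth-first search for the admissible A* run used in the thesis; with unit-cost, reversible moves it yields exact distances too. The 2x2 sliding puzzle, the state encoding, and the function names are assumptions chosen to keep the example small.

```python
from collections import deque


def neighbors(state, width=2):
    """Yield the states reached by sliding one tile into the blank (0)."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    height = len(state) // width
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < height and 0 <= c < width:
            target = r * width + c
            cells = list(state)
            cells[blank], cells[target] = cells[target], cells[blank]
            yield tuple(cells)


def distances_from_goal(goal):
    """Build the inverted state space: state -> exact distance to the goal."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        state = queue.popleft()
        for nxt in neighbors(state):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


GOAL = (1, 2, 3, 0)
TABLE = distances_from_goal(GOAL)  # built once, reused for every problem
```

Once the table exists, the true distance for any node met during a search is a single lookup, which is the saving the text describes when all start/goal pairs share one goal.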

D.  EMPIRICAL  RESULTS 

Each of the contrived heuristics described above was
run using the three source profiles (K1, K2, and K3) as its
information source (giving nine simulated heuristics in
all), using the Weighted A* Graph Search algorithm with G
as the discriminator, and at weights 0.2, 0.5, 0.6, 0.7,
0.8, 0.9, and 1.0. In order to distinguish between so many
heuristics, we extended the shorthand naming convention
used by the actual heuristics (i.e. K1, K2, K3). Table 8.3
provides a legend indicating the name and type for each of
the heuristics examined in this empirical study.


TABLE 8.3

Legend of Heuristic Names

Name   Actual or Contrived   Description
----   -------------------   ---------------------------
K1     Actual                Number of tiles misplaced
K2     Actual                Manhattan Distance
K3     Actual                Enhanced Manhattan Distance
K4     Simulated K1          Normal
K5     Simulated K2          Normal
K6     Simulated K3          Normal
K7     Simulated K1          Histogram
K8     Simulated K2          Histogram
K9     Simulated K3          Histogram
K10    Simulated K1          Worst-case
K11    Simulated K2          Worst-case
K12    Simulated K3          Worst-case

Since there are three contrived heuristics using the
profile created from each actual heuristic, it is
frequently convenient to refer to them as a group, and
the following notation will be adopted: the entire group of
four (one actual heuristic and its three contrived
counterparts) will be referred to as a 'Set', with Set K1
referring to the heuristics K1, K4, K7, and K10 (since K4,
K7, and K10 borrow K1's Source Profile); Set K2
refers to the heuristics K2, K5, K8, and K11; and Set K3
refers to K3, K6, K9, and K12. When an entire set is not
desired, subsets will use the following notation: K1/4/7
meaning heuristics K1, K4, and K7, and K2/5/8 meaning
heuristics K2, K5, and K8, etc.
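The naming convention follows a regular pattern: the contrived counterparts of actual heuristic Kj are Kj+3, Kj+6, and Kj+9. A small sketch makes the mapping explicit; the helper names are hypothetical.

```python
def heuristic_set(j):
    """Names in Set Kj: the actual heuristic and its three simulations."""
    if j not in (1, 2, 3):
        raise ValueError("only K1, K2, and K3 are actual heuristics")
    return [f"K{j + 3 * k}" for k in range(4)]


def subset_name(j, count=3):
    """Shorthand such as K1/4/7 for the first `count` members of Set Kj."""
    return "K" + "/".join(str(j + 3 * k) for k in range(count))
```

For example, `heuristic_set(2)` lists K2, K5, K8, and K11, and `subset_name(2)` gives the shorthand K2/5/8.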

The following pages contain graphs representing the
results of the simulation experiments. First, we present
the run profiles generated by each heuristic during the
solution of the problem set, followed by the graphs
depicting their performance in terms of number of nodes
expanded and length of the solution path found.

1.  RUN  PROFILES 

Figures  8.5  through  8.16  show  the  performance  of  the  12 
heuristics  (Run  Profiles)  during  the  solution  of  the  198 
problem  pairs,  including  the  observed  KMIN,  KMEAN,  KMAX, 
and  standard  deviation  values.  (Note  that  a  run  profile  is 
distinct  from  the  Source  Profiles  used  by  the  simulation.) 
As  in  the  graphs  for  the  source  profiles,  the  run  profiles 
are  presented  using  a  two-dimensional  view  coupled  with  a 
pair  of  three-dimensional  views  to  assist  in  picturing  the 
distribution  of  the  values  returned  by  the  heuristic  during 
the solution of the problem set. The run profile gives
insight into how the contrived heuristic behaved in
comparison to the Source Profile on which it was based.
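A profile of this kind can be assembled from a log of (distance, value) observations. The sketch below is an assumption about the bookkeeping, not the thesis tooling; in particular it uses the population standard deviation, and the source does not say which form was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_profile(observations):
    """Summarise heuristic values by true distance to the goal.

    `observations` is an iterable of (distance, value) pairs.
    """
    by_distance = defaultdict(list)
    for distance, value in observations:
        by_distance[distance].append(value)
    return {
        d: {
            "KMIN": min(values),
            "KMEAN": mean(values),
            "KMAX": max(values),
            "ST_DEV": pstdev(values),  # population form; an assumption
        }
        for d, values in sorted(by_distance.items())
    }


# Hypothetical observations logged during a run.
profile = build_profile([(1, 1), (1, 1), (2, 1), (2, 3)])
```

Feeding the values logged during the solution of a problem set yields a Run Profile, while feeding values gathered over a broader sample of states yields something closer to a Source Profile.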

Figures 8.5, 8.6, and 8.7 show the run profiles for the
actual heuristics K1, K2, and K3. Figures 8.8, 8.9, and
8.10 show the run profiles for the contrived heuristics K4,
K5, and K6, which were based on a normal distribution about
the Source Profile of K1, K2, and K3, respectively, and
show the 'bell' curve of values rising from the plane in
the three-dimensional graphs. Comparing Figures 8.8a,
8.9a, and 8.10a to their Source Profile counterparts
(Figures 8.2a, 8.3a, and 8.4a, respectively), one can
observe that although KMEAN is closely duplicated, KMIN and
KMAX are not the same.


The run profiles for contrived heuristics K7, K8, and
K9 (Figures 8.11, 8.12, and 8.13) are interesting because
they duplicated their Source Profiles (see Figures 8.2
through 8.4), indicating that they did indeed mimic the
aggregate statistical behavior of the actual heuristic.

The run profiles for K10, K11, and K12 (Figures 8.14,
8.15, and 8.16) show the peaks surrounding KMEAN, and show
by their height that most of the values returned were for
the "off-the-path" nodes and that very few "on-path" nodes
were expanded. This is confirmed by observing the
two-dimensional versions, where KMEAN is almost superimposed
on top of KMIN, yet KMAX hovers high above.

[Figure 8.5: Run Profile for K1. Figure 8.5a: Two-Dimensional View. Figure 8.5b: Front Perspective.]
[Figure 8.8 (Cont)]
[Figure 8.11: Run Profile for K7]
[Figure 8.12: Run Profile for K8]
[Run Profile for K9]
[Figure 8.15: Run Profile for K11]

2. COMPLEXITY PERFORMANCE RESULTS

Figures  8.17  through  8.22  graphically  present  the 
performance  results  of  the  simulation  experiments  in  terms 
of  nodes  expanded  versus  the  depth  of  the  goal  (Figures 
8.17  to  8.19)  and  solution  path  length  versus  depth  of  goal 
(Figures 8.20 to 8.22). In order to save space and enhance
comparison between the contrived heuristics and their
actual counterparts, each graph contains four performance
curves: one for each of the three contrived heuristics, and
a fourth curve for the actual heuristic on whose profile
the contrived heuristics were based.

Each plot presents the performance of the set of four
heuristics at one weight. The results are rich with
information, so only the most significant phenomena will
be highlighted. First we focus on the trends observed from
the Normal and Actual distributions, because their
behavior was similar enough to merit being considered
together. We will then examine Worst-Case performance
separately.


a.  NORMAL  AND  ACTUAL  DISTRIBUTION  RESULTS 


The performances of Normal and Actual contrived
heuristics were quite similar to each other. In the graphs
that follow, their curves tend to follow each other around
rather independently of the other curves on the graph. For
Set K1 and Set K2 at the lower weights (Figures 8.17a,
8.17b, and 8.18a), and for Set K3 at all weights (Figures
8.19a-f) they also coincide very closely with the
performance of the actual heuristic. However, when one
deviates, the other is found nearby, as shown in Figures
8.18c-f and 8.17e-f.

1.  RANGE  OF  EFFECTIVENESS 

The results show that even when the simulation was not
very good, there were always ranges of N in which the
simulation was very effective. For K1/4/7, this range
appears to be for levels 1-5 (Figures 8.17a-f); for
K2/5/8, the range appears to be from 1-7 (Figures 8.18a-f);
and for K3/6/9, the simulation is not exact at any single N
but fairly close over all N (Figures 8.19a-f). Comparing
these ranges against the source profiles (Figures 8.2-8.4)
exposes some interesting coincidences: Source Profile K1
(Figure 8.2) shows that KMEAN levels off at 5, and Source
Profile K2 (Figure 8.3) shows KMEAN leveling off at around
8, but in Source Profile K3 (Figure 8.4), KMEAN never
really levels off at all.

This suggests that the actual heuristics have a range
within which they are very effective in predicting the
distance remaining to the goal, and beyond that range,
their estimate is no longer 'intelligent' but merely a
bounded guess. In a sense, they display a
'nearsightedness'. K1 is the worst: it never exceeds the
value 6, and its mean levels off at 5. When the goal is
beyond that range, K1 is incapable of providing a
meaningful estimate and instead gives a guess that
corresponds to its range, or the limit of its sight. K2
has somewhat better vision, but it too levels off at around
8 or 9, meaning that for many of the more distant goals, its
average response is still shortsighted and fairly
meaningless.

K3 displays a different behavior altogether. Although
KMEAN always overestimates K*, it does so in a
monotonically increasing fashion. When the goal is 20
moves away, K3 not only has the capacity to provide a
meaningful response where K1 and K2 cannot, it also
consistently lowers its estimate as it gets closer to the
goal.

The  simulation  within  the  range  of  each  heuristic  is 
good  at  all  weights.  Beyond  this  range,  since  the  actual 
heuristic  doesn't  return  meaningful  values,  attempting  to 
simulate  this  proves  to  be  ineffective. 


2.  TIMING 

Figures 8.11 through 8.13 show the run profiles based
on the contrived heuristics using the actual frequency
distribution (K7, K8, and K9). These are exact duplicates
of the Source Profiles (compare to Figures 8.2 to 8.4).
This means that the values generated by the contrived
heuristics during the solution of the 198 problems had
aggregate performances that were exactly like the actual
heuristic's profile, and yet the simulation was not so
exact (Figures 8.17f, 8.18c-f). What caused these
performance variations when the profiles were so accurately
preserved? Timing.

While  the  profile  permits  precise  duplication  of  the 
distribution  of  the  actual  heuristic  for  the  entire  run,  it 
cannot  guarantee  that  the  simulated  heuristic  will  respond 
with  the  same  value  that  the  actual  heuristic  would  have  at 
any  given  node.  The  contrived  heuristic  may  have  terrible 
timing  and  give  high  values  when  the  actual  heuristic  would 
have  given  low  values,  or  it  may  have  impeccable  timing, 
returning  different  values  but  doing  so  in  such  a 
combination  that  the  search  process  is  led  directly  to  the 
goal. This could be why K2/5/8 at the higher weights
perform so much better than the actual heuristic (Figures
8.18d-f). While timing doesn't affect K3/6/9 as
dramatically, this may well be the cause of the occasional
minor  deviations  observed  in  their  graphs  (Figures  8.19a- 


3.  WEIGHT 

Weight also has an impact on the effectiveness of the
simulation: generally we observe very similar results at
the lower weights, with the simulation becoming less
effective as the weight increases. This probably isn't too
significant, because at low weights the search is
essentially breadth-oriented and the H component has only a
minor impact on the direction of the search pattern.
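The effect of weight on search direction can be illustrated with a small sketch. The weighting formula below, F = (1 - W) * G + W * H, is an assumption: it is a common form of weighted A* that is consistent with the behavior described here, but this section does not state the formula used in the thesis. The node values are invented.

```python
def f_value(g, h, weight):
    """Weighted evaluation, assuming F = (1 - W) * G + W * H."""
    return (1.0 - weight) * g + weight * h


# Two hypothetical open nodes: one shallow with a poor estimate,
# one deep with a good estimate.
OPEN_NODES = {"shallow": (2, 9), "deep": (6, 3)}  # name -> (G, H)


def next_to_expand(weight):
    """Name of the open node with the lowest F-value at this weight."""
    return min(OPEN_NODES, key=lambda name: f_value(*OPEN_NODES[name], weight))
```

Under this assumed formula, `next_to_expand(0.2)` picks the shallow node, since G dominates and the search is breadth-oriented, while `next_to_expand(0.9)` picks the deep node, since H dominates. Errors in a simulated H therefore matter far more at high weights.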
Specifically, simulations for K1/4/7 imitated the
actual heuristic's performance very closely over all N when
the weight was less than 0.8 (Figures 8.17a-d), but at
weights of 0.8 and above, the results deviated significantly
beyond the range of the heuristic (Figures 8.17e and f).

For K2/5/8, the simulation breaks up beyond the heuristic's
range at weights of 0.6 and above (Figures 8.18c-f).
Simulation effectiveness for K3/6/9 (Figures 8.19a-f)
seems unaffected by weight.


b.  WORST-CASE  DISTRIBUTION  RESULTS 

The  Worst-Case  simulation  produced  mixed  results.  We 
anticipated  seeing  more  nodes  expanded  than  the  actual 
heuristic  produced,  since  the  characteristic  of  this 
contrived  heuristic  is  to  throw  the  search  pattern  away 
from the true path. For Set K1, we observed that K10
provided  an  upper  bound  on  the  other  performance  curves  in 
terms  of  nodes  expanded  at  all  weights  less  than  0.8  (See 
Figures  8.17a-e).  However,  at  weight  0.9  (Figure  8.17f), 
the  growth  at  levels  (N)  5-6  was  super-exponential,  jumping 
from  about  7  nodes  expanded  to  over  100,  but  leveling 
completely  for  N>6. 

K11 provides similar behavior for Set K2, giving an
upper bound for nodes expanded for weights less than 0.8
(Figures 8.18a-d), and, like K10, takes a sudden steep rise
at weight 0.8 and 0.9 (Figures 8.18e-f), leveling off after
that. An interesting phenomenon is then observed: K11
becomes a lower bound for nodes expanded for Set K2 at
levels 12 to 19!

K12 shows the same trend, giving an upper bound at
weight 0.2 (Figure 8.19a), and for weights 0.5 through 0.9
(Figures 8.19b-f) taking the sudden, steep rise that
subsequently levels off, and like K11, becomes a lower
bound for the deeper levels of N. In fact, it becomes
near-optimal.

The  upper  bound  results  were  expected,  but  the  leveling 
off  and  lower  bound  results  are  curious  and  deserve  some 
explanation.  The  steep  rise  indicates  that  the  search  is 
nearly  Breadth-First  in  nature.  The  distribution  adds  or 
subtracts  one  standard  deviation  from  KMEAN,  which  at  the 
lower  values  of  i,  represents  a  sizeable  portion  of  the  F- 
value.  As  i  increases,  the  impact  of  one  standard 
deviation  added  to  or  taken  away  from  KMEAN  becomes  less 
significant,  and  the  search  becomes  increasingly  effective. 

Figures 8.20 through 8.22 show the path lengths, and
indicate the solution path lengths for K10, K11, and K12 at
the middle values of N were very large, tapering off to
almost ideal as N increased. This suggests that where the
search expanded the full tree, the ideal path was avoided
(because it was forced to by the nature of the contrived
heuristic), and longer, alternative detours were taken to
arrive at the goal.


[Figure 8.17b: Simulation Results, Heuristics K1, K4, K7, K10, Weight = 0.5. Axes: XMEAN versus Depth of Goal (N). Legend: optimal, breadth first, and XMean curves for K1, K4, K7, and K10 at weight 0.50.]

[Figure (number and caption illegible): Simulation Results. Axis: XMEAN. Legend: optimal, breadth first, and XMean curves.]

[Figure 8.19a: Simulation Results, Heuristics K3, K6, K9, K12. Axes: XMEAN versus Depth of Goal (N). Legend: optimal, breadth first, and XMean curves for K3, K6, K9, and K12 at weight 0.20.]

[Figure 8.19f: Simulation Results, Heuristics K3, K6, K9, K12. Axes: XMEAN versus Depth of Goal (N). Legend: optimal, breadth first, and XMean curves.]

E.  CONCLUSIONS 


Simulation  by  statistical  profile  provides  interesting 
and  varied  behavior.  There  are  four  important  contributing 
factors  involved  in  achieving  a  good  simulation:  (1)  Range, 
(2) Distribution, (3) Weight, and (4) Timing. Our choices
of contrived heuristics provided a good filter for some of
these items. Heuristics K1, K2, and K3 vary in Range; the
simulation using the actual distribution (K7, K8, and K9)
eliminated variations in distribution, and permitted
focusing on effects of Timing; K10, K11, and K12 eliminated
timing variations and focused exclusively on the effects of
distribution.

Range is the distance at which the heuristic can see
the goal; it is inherent to the heuristic and cannot be
altered. Profiling gives insight into what a heuristic's
range is. Within range, simulation was excellent at all
weights, assuming a fair distribution was used (meaning the
contrived heuristic makes a serious attempt to reproduce
the Source Profile). Outside a heuristic's range, the
simulation is effective at low weights, but not effective
at high weights.

Distribution is important to simulation, and has the
advantage that it can be altered. For example, the
Worst-Case distribution gave dramatically different results
from K4-K9, and only the distribution was different. While
K7-K9 provided the most exact distribution possible from the
Source Profiles, they didn't perform as closely to the
actual heuristics as anticipated. What the distribution
cannot capture is Timing. If the profiles could be
augmented somehow to provide timing information, we expect
that the simulation would be exact, regardless of the Range
or Weight. Unfortunately, we know of no reasonable way to
capture this level of detail.
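The distinction between distribution and Timing can be made concrete with a toy example: two assignments of values to nodes that share exactly the same aggregate histogram yet disagree at individual nodes. The node names and values are hypothetical.

```python
from collections import Counter

# Values the actual heuristic would return at four nodes (hypothetical).
actual = {"n1": 2, "n2": 2, "n3": 5, "n4": 7}

# A simulated heuristic that returns the same values in a different order.
nodes = sorted(actual)
simulated = dict(zip(nodes, reversed([actual[n] for n in nodes])))

# The aggregate profiles are identical ...
same_profile = Counter(actual.values()) == Counter(simulated.values())

# ... but the per-node answers, which steer the search, are not.
disagreements = [n for n in nodes if actual[n] != simulated[n]]
```

Here the two heuristics are indistinguishable by any profile statistic, yet they disagree at every node, which is the kind of variation a profile alone cannot capture.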

Simulation works well at the lower weights for all of
the contrived heuristics. A low weight tends to soften the
impact of poor timing and range and, to some extent,
distribution, by giving the H component less influence on
the F-value.

The  results  show  that  heuristics  cannot  be  considered 
equivalent  purely  on  the  basis  of  having  identical  KMIN  and 
KMAX  bounding  functions.  Even  complete  and  exact  profile 
duplication  does  not  guarantee  identical  performance,  or 
even  close  performance  (as  in  K2  at  W=0.8  and  0.9,  Figures 
8.18e-f)  because  the  timing  may  differ  dramatically.  In 
addition  to  sharing  the  functions  KMIN,  KMEAN,  and  KMAX,  it 
appears  that  having  a  small  standard  deviation  reduces  the 
variances  caused  by  Timing. 

Therefore, to ensure good simulation: (1) use
heuristics with good range, tight standard deviation, and
possessing a fair distribution; (2) for heuristics without
good range, lowering the weight will improve simulation;
and finally, (3) if the standard deviation of the Source
Profile is large, Timing will cause at least minor
variations no matter the range of the heuristic (as in K3).


The  technique  of  simulation  is  appealing  and  promising 
because  the  researcher  can  vary  the  distribution  in  order 
to  focus  on  specific  behavior.  Our  profiles  were  based  on 
actual  heuristics,  but  there  is  no  reason  they  couldn't  be 
represented  as  a  table  of  values  that  the  researcher  could 
alter. Viewed in this manner, the researcher could
contrive profiles and thereby control Range and standard
deviation (which we could not do), in addition to
distribution and weight (which we were limited to).
Therefore, this mechanism appears useful in modelling
specific heuristic behavior in controlled circumstances
unlike those found in real life (such as a heuristic with
linear error and no standard deviation), and empirically
studying the results.

Besides modelling heuristics, future research could
also be conducted into the equivalence of heuristics whose
domains are different. If certain conditions permit a
profile to simulate the actual heuristic while the domain is
kept fixed, and these conditions are preserved in another
domain, does the profile retain its ability there? And, if
profiles can be used to establish the equivalence of
heuristics in the same domain, are there conditions under
which the equality of profiles from distinct domains
establishes equivalence also? That is, if a heuristic in
the 6-puzzle has the same profile as a heuristic in the
checkerboard domain, can they be called equivalent? Or,
can a heuristic be moved "in spirit" by its statistical
profile to another domain and still have power?

Pearl (1984, Chapters 6 and 7) investigated the
behavior of UA* on the assumption that H(n) is a random
variable whose distribution depends only upon G*(n) (which
is the actual minimal distance from the root to the node n,
and closely related to what we have called G), and H*(n)
(what we have called i, or the actual distance remaining to
the goal), and that H(n) is independent for each node. Our
study sheds light on the plausibility of Pearl's
assumptions, permitting the general search problem to be
viewed as consisting of three independent variable
components: (1) the search algorithm (such as A*, Weighted
A*, etc.), (2) the graph or domain (such as the 6-Puzzle,
8-Puzzle, checkerboard, etc.), and (3) the heuristic (a
random variable). If random variables can be used under
the proper conditions to simulate heuristics, as we have
shown, then we can vary any of the three above
independently of the others and thereby attempt to get real
insight into the general searching process.

Using our tools and techniques, a world of research
possibilities has been opened up, and we hope that they
will be beneficial when applied to answering these issues.

APPENDIX A

General Beads World Tools Modules


(*
 *  Utilities Module (v4.0 23-Feb-86 AJC/SRH)
 *)

module utilities (input, output);
const
    max_positions = 10;
    no_puzzle = '          ';    (* blank state, one blank per position *)
    center = 1;
    first = 2;

type
    positions = center..max_positions;
    puzzle_state = packed array [positions] of char;

    node_ptr = ^puzzle_node;

    neighbor_node_ptr = ^neighbor_node;
    neighbor_node = record
        neighbor : node_ptr;
        next : neighbor_node_ptr;
    end;

    puzzle_node = record
        state : puzzle_state;
        left, right, sort_left, sort_right : node_ptr;
        neighbors : neighbor_node_ptr;
        parent : node_ptr;
        g_value, h_value : integer;
        f_value : real;
    end;

(*
 *  puzzle I/O functions
 *)

[global]
procedure read_state (var s : puzzle_state; n : integer);
var
    i, p : integer;
    ch : char;
begin
    (* skip ahead to the opening parenthesis *)
    repeat
        read (ch);
    until ch = '(';
    for i := center to n do
    begin
        read (p);
        s[i] := chr (p);
    end;
    (* the state must be closed by a parenthesis on the same line *)
    repeat
        read (ch);
    until (ch = ')') or eoln;
    if ch <> ')' then
    begin
        writeln (' *** Error - Improper format on state input.');
        halt;
    end;
end;

[global]
procedure print_state (s : puzzle_state; n : integer);
var
    i : integer;
begin
    write (' (');
    for i := center to n do
        write (ord(s[i]):2, ' ');
    write (') ');
end;


(*
 *  node and list primitives
 *)

[global]
function create_puzzle_node (s : puzzle_state): node_ptr;
var
    n : node_ptr;
begin
    new (n);
    n^.state := s;
    n^.left := nil;
    n^.right := nil;
    n^.sort_left := nil;
    n^.sort_right := nil;
    n^.neighbors := nil;
    n^.parent := nil;
    n^.g_value := 0;
    n^.h_value := 0;
    n^.f_value := 0.0;
    create_puzzle_node := n;
end;

[global]
procedure free_node (var n : node_ptr);
var
    a, p : neighbor_node_ptr;
begin
    (* release the neighbor list before the node itself *)
    a := n^.neighbors;
    while a <> nil do
    begin
        p := a;
        a := a^.next;
        dispose (p);
    end;
    dispose (n);
end;

[global]
procedure create_empty_list (var list : node_ptr);
begin
    (* the header node links to itself; its g_value holds the list length *)
    list := create_puzzle_node (no_puzzle);
    list^.left := list;
    list^.right := list;
    list^.f_value := 0.0;
    list^.g_value := 0;
end;

[global]
function is_empty (list : node_ptr) : boolean;
begin
    is_empty := (list^.g_value = 0);
end;

[global]
procedure place_on_end_of_list (p, list : node_ptr);
begin
    p^.left := list^.left;
    p^.right := list;
    list^.left^.right := p;
    list^.left := p;
    list^.g_value := list^.g_value + 1;
end;

[global]
procedure place_in_ascending_order (p, list : node_ptr);
var
    q, r : node_ptr;
begin
    (* find the first node whose f_value is not less than p's *)
    q := list^.right;
    while (q <> list) and (p^.f_value > q^.f_value) do
        q := q^.right;
    r := q^.left;
    r^.right := p;
    q^.left := p;
    p^.left := r;
    p^.right := q;
    list^.g_value := list^.g_value + 1;
end;

[global]
function remove_from_front_of_list (list : node_ptr) : node_ptr;
var
    p, q : node_ptr;
begin
    if list^.g_value = 0 then
        remove_from_front_of_list := nil
    else
    begin
        p := list^.right;
        q := p^.right;
        q^.left := list;
        list^.right := q;
        p^.left := nil;
        p^.right := nil;
        list^.g_value := list^.g_value - 1;
        remove_from_front_of_list := p;
    end;
end;

[global]
procedure delete_from_list (p, list : node_ptr);
var
    l, r : node_ptr;
begin
    if p <> nil then
    begin
        l := p^.left;
        r := p^.right;
        l^.right := r;
        r^.left := l;
        p^.left := nil;
        p^.right := nil;
        list^.g_value := list^.g_value - 1;
    end;
end;

[global]
procedure free_list (var list : node_ptr);
var
    p : node_ptr;
begin
    while list^.g_value > 0 do
    begin
        p := remove_from_front_of_list (list);
        free_node (p);
    end;
    free_node (list);
end;


(*
 *  search tree primitives
 *)

[global]
procedure insert_in_tree (n : node_ptr; var tree : node_ptr);
var
    q, r : node_ptr;
begin
    if tree = nil then
        tree := n
    else
    begin
        (* walk down to the leaf position, ordering by state *)
        q := tree;
        while q <> nil do
        begin
            r := q;
            if n^.state < q^.state then
                q := q^.sort_left
            else
                q := q^.sort_right;
        end;
        if n^.state < r^.state then
            r^.sort_left := n
        else
            r^.sort_right := n;
    end;
end;

[global]
function find_in_tree (n, tree : node_ptr) : node_ptr;
var
    p : node_ptr;
    found : boolean;
begin
    found := false;
    p := tree;
    while (p <> nil) and (not found) do
    begin
        if p^.state = n^.state then
            found := true
        else
            if n^.state < p^.state then
                p := p^.sort_left
            else
                p := p^.sort_right;
    end;
    if found then
        find_in_tree := p
    else
        find_in_tree := nil;
end;


[global] 

function  find_atata_in.tr**  (a  :  puzzla.atat* ; 
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tree  :  node.ptr)  :  node.ptr; 


p  :  node.ptr; 
found  :  boolean ; 
begin 

found  :=  false; 
p  : =  tree ; 

while  (p  <>  nil)  and  (not  found)  do 
begin 

if  p“. state  =  a  then 
found  : =  true 

else 

if  a  <  p*. state  then 
p  :=  p~.sort_left 

else 

p  :=  p~ . sort.right ; 

end; 

if  found  then 

find_state.in.tree  :=  p 

elee 

f ind.state_in.tree  :=  nil; 

end; 

[global]
procedure free_binary_tree (var t : node_ptr);
var
    p, q : node_ptr;
begin
    if t <> nil then
    begin
        free_binary_tree (t^.sort_left);
        free_binary_tree (t^.sort_right);
        free_node (t);
        t := nil;
    end;
end;

[global]
procedure free_graph (var g : node_ptr);
var
    n : neighbor_node_ptr;
begin
    (* a state of no_puzzle marks a node that has already been visited *)
    if (g^.state <> no_puzzle) then
    begin
        g^.state := no_puzzle;
        n := g^.neighbors;
        while n <> nil do
        begin
            free_graph (n^.neighbor);
            n := n^.next;
        end;

        free_node (g);
    end;
end;
module control (input, output);

const

(*
 *  From UTILITIES Import CONST
 *)

    max_positions = 10;
    no_puzzle = '          ';

    center = 1;
    first = 2;

(*
 *  CONTROL
 *)

    max_links = max_positions;
    max_levels = 99;


type

(*
 *  From UTILITIES Import TYPE
 *)

    positions = center..max_positions;
    puzzle_state = packed array [positions] of char;

    node_ptr = ^puzzle_node;

    neighbor_node_ptr = ^neighbor_node;
    neighbor_node = record
        neighbor : node_ptr;
        next : neighbor_node_ptr;
    end;

    puzzle_node = record
        state : puzzle_state;
        left, right, sort_left, sort_right : node_ptr;
        neighbors : neighbor_node_ptr;
        parent : node_ptr;
        g_value, h_value : integer;
        f_value : real;
    end;

(*
 *  CONTROL
 *)

    link_array = array [first..max_positions] of boolean;

    level_record = record
        count : integer;
        list : node_ptr;
    end;

    level_array = array [0..max_levels] of level_record;

    graph_descriptor = record
        depth, generated, expanded : integer;
        level : level_array;
    end;

    results_descriptor = record
        solved : boolean;
        path_length, min_path_length, generated,
        expanded, heuristic : integer;
        weight : real;
        start, goal : node_ptr;
    end;


(*
 *  puzzle characteristics
 *)

var
    num_positions, num_links : [global] integer;
    link : [global] link_array;


(*
 *  From UTILITIES Import:
 *)

[external]
procedure print_state (s : puzzle_state; n : integer);
external;

[external]
function create_puzzle_node (p : puzzle_state) : node_ptr;
external;

[external]
procedure free_node (var n : node_ptr);
external;

[external]
procedure create_empty_list (var list : node_ptr);
external;

[external]
function is_empty (list : node_ptr) : boolean;
external;

[external]
procedure place_on_end_of_list (p, list : node_ptr);
external;

[external]
procedure place_in_ascending_order (p, list : node_ptr);
external;

[external]
function remove_from_front_of_list (list : node_ptr) : node_ptr;
external;

[external]
procedure delete_from_list (p, list : node_ptr);
external;

[external]
procedure free_list (var list : node_ptr);
external;

[external]
procedure insert_in_tree (n : node_ptr; var tree : node_ptr);
external;

[external]
function find_in_tree (n, tree : node_ptr) : node_ptr;
external;

(*
 *  From HEURISTIC Import:
 *)

[external]
procedure initialize_heuristics;
external;

[external]
function estimated_distance (heuristic : integer;
                             current : node_ptr;
                             goal : puzzle_state;
                             c_star : integer) : integer;
external;


(*
 *  CONTROL
 *)

[global]
procedure initialize_control (np, nl : integer; l : link_array);
var
    i : positions;
begin
    num_positions := np;
    num_links := nl;
    for i := first to np do
        link[i] := l[i];

    initialize_heuristics;
end;
procedure generate_successors (p : puzzle_state;
                               successor_list : node_ptr);
var
    i : integer;
    blnk : char;

    (* copy p, move the bead at source into the blank at dest, and queue it *)
    procedure add_successor (source, dest : integer);
    var
        c : node_ptr;
    begin
        c := create_puzzle_node (p);
        c^.state[dest] := c^.state[source];
        c^.state[source] := blnk;
        place_on_end_of_list (c, successor_list);
    end;

begin
    blnk := chr(0);

    (* moves of perimeter beads: around the ring, or into the center *)
    for i := first to num_positions do
    begin
        if p[i] <> blnk then
        begin
            if (i = num_positions) and (p[first] = blnk) then
                add_successor (num_positions, first);
            if (i <> num_positions) and (p[i+1] = blnk) then
                add_successor (i, i+1);
            if (i = first) and (p[num_positions] = blnk) then
                add_successor (first, num_positions);
            if (i <> first) and (p[i-1] = blnk) then
                add_successor (i, i-1);
            if (p[center] = blnk) and link[i] then
                add_successor (i, center);
        end;
    end;

    (* moves of the center bead out along a link *)
    if p[center] <> blnk then
    begin
        for i := first to num_positions do
            if link[i] and (p[i] = blnk) then
                add_successor (center, i);
    end;
end;

procedure add_neighbor (neighbor, current : node_ptr);
var
    n : neighbor_node_ptr;
begin
    new (n);
    n^.neighbor := neighbor;
    n^.next := current^.neighbors;
    current^.neighbors := n;
end;

(*
 *  Search tree generation
 *)

[global]
procedure initialize_graph_descriptor (var g : graph_descriptor);
var
    i : integer;
begin
    g.depth := 0;
    g.generated := 0;
    g.expanded := 0;
    for i := 0 to max_levels do
    begin
        g.level[i].count := 0;
        create_empty_list (g.level[i].list);
    end;
end;


[global]
procedure generate_graph (start_state : puzzle_state;
                          var search_tree, graph : node_ptr;
                          var g : graph_descriptor);
var
    start, current, c, p : node_ptr;
    open, successor_list : node_ptr;
    depth, nodes_generated, nodes_expanded : integer;
begin
    create_empty_list (open);
    create_empty_list (successor_list);
    search_tree := nil;
    graph := nil;

    nodes_generated := 1;
    nodes_expanded := 0;

    start := create_puzzle_node (start_state);
    start^.g_value := 0;
    start^.f_value := 0.0;
    insert_in_tree (start, search_tree);
    graph := start;

    place_in_ascending_order (start, open);

    (* f_value equals depth, so open is expanded in breadth-first order *)
    while not is_empty (open) do
    begin
        current := remove_from_front_of_list (open);
        nodes_expanded := nodes_expanded + 1;
        depth := current^.g_value;

        g.level[depth].count := g.level[depth].count + 1;
        place_on_end_of_list (current, g.level[depth].list);

        generate_successors (current^.state, successor_list);
        while not is_empty (successor_list) do
        begin
            c := remove_from_front_of_list (successor_list);
            c^.g_value := depth + 1;
            c^.f_value := depth + 1;
            p := find_in_tree (c, search_tree);
            if p = nil then
            begin
                insert_in_tree (c, search_tree);
                add_neighbor (c, current);
                place_in_ascending_order (c, open);
            end
            else
            begin
                add_neighbor (p, current);
                free_node (c);
            end;

            nodes_generated := nodes_generated + 1;
        end;
    end;

    g.depth := depth;
    g.generated := nodes_generated;
    g.expanded := nodes_expanded;
    free_list (open);
    free_list (successor_list);
end;


(*
 *  Problem solution routines
 *)

[global]
procedure initialize_results (var r : results_descriptor);
begin
    r.heuristic := 0;
    r.weight := 0.0;
    r.generated := 0;
    r.expanded := 0;
    r.path_length := 0;
    r.min_path_length := 0;
    r.start := nil;
    r.goal := nil;
end;

[global]
procedure print_puzzle_solution (r : results_descriptor);

    (* recurse back to the start node so states print in path order *)
    procedure pps (n : node_ptr);
    begin
        if n^.state <> r.start^.state then
            pps (n^.parent);
        print_state (n^.state, num_positions);
        writeln;
    end;

begin
    pps (r.goal)
end;


[global]
procedure ordered_search (start, goal : puzzle_state;
                          heuristic : integer;
                          weight : real;
                          var results : results_descriptor);
var
    open, successor_list, search_tree : node_ptr;
    start_node, current, p, c : node_ptr;
    nodes_generated, nodes_expanded : integer;
begin
    create_empty_list (open);
    create_empty_list (successor_list);
    search_tree := nil;

    nodes_generated := 1;
    nodes_expanded := 0;

    start_node := create_puzzle_node (start);
    start_node^.g_value := 0;
    start_node^.h_value := estimated_distance (heuristic, start_node,
                                               goal, results.min_path_length);
    start_node^.f_value := (1.0 - weight) * start_node^.g_value +
                           weight * start_node^.h_value;
    insert_in_tree (start_node, search_tree);
    place_in_ascending_order (start_node, open);

    repeat
        current := remove_from_front_of_list (open);
        if (current^.state <> goal) then
        begin
            nodes_expanded := nodes_expanded + 1;
            generate_successors (current^.state, successor_list);
            while not is_empty (successor_list) do
            begin
                c := remove_from_front_of_list (successor_list);
                c^.g_value := current^.g_value + 1;
                p := find_in_tree (c, search_tree);
                if p = nil then
                begin
                    c^.parent := current;
                    c^.h_value := estimated_distance (heuristic, c,
                                      goal, results.min_path_length);
                    c^.f_value := (1.0 - weight) * c^.g_value +
                                  weight * c^.h_value;
                    insert_in_tree (c, search_tree);
                    place_in_ascending_order (c, open);
                end
                else
                begin
                    (* a shorter path to a known state: redirect and reopen it *)
                    if c^.g_value < p^.g_value then
                    begin
                        p^.parent := current;
                        p^.g_value := c^.g_value;
                        p^.f_value := (1.0 - weight) * p^.g_value +
                                      weight * p^.h_value;
                        if not ((p^.left = nil) and
                                (p^.right = nil)) then
                            delete_from_list (p, open);
                        place_in_ascending_order (p, open);
                    end;

                    free_node (c);
                end;

                nodes_generated := nodes_generated + 1;
            end;
        end;
    until is_empty (open) or (current^.state = goal);

    if current^.state = goal then
        results.solved := true
    else
        results.solved := false;
    results.start := start_node;
    results.goal := current;
    results.heuristic := heuristic;
    results.weight := weight;
    results.path_length := current^.g_value;
    results.generated := nodes_generated;
    results.expanded := nodes_expanded;
    free_list (successor_list);
    free_node (open);
end;

[global]
procedure graph_search (start, goal : puzzle_state;
                        heuristic : integer;
                        weight : real;
                        var results : results_descriptor);
var
    open, successor_list, search_tree : node_ptr;
    start_node, current, p, c : node_ptr;
    nodes_generated, nodes_expanded : integer;

    (* propagate an improved g_value from p to its recorded neighbors *)
    procedure update (p : node_ptr);
    var
        n : neighbor_node_ptr;
        m : node_ptr;
    begin
        n := p^.neighbors;
        while n <> nil do
        begin
            m := n^.neighbor;
            if (p^.g_value + 1) < m^.g_value then
            begin
                m^.g_value := p^.g_value + 1;
                m^.f_value := (1.0 - weight) * m^.g_value +
                              weight * m^.h_value;
                if not ((m^.left = nil) and (m^.right = nil)) then
                begin
                    delete_from_list (m, open);
                    place_in_ascending_order (m, open);
                end;

                update (m);
            end;

            n := n^.next;
        end;
    end;

begin
    create_empty_list (open);
    create_empty_list (successor_list);
    search_tree := nil;

    nodes_generated := 1;
    nodes_expanded := 0;

    start_node := create_puzzle_node (start);
    start_node^.g_value := 0;
    start_node^.h_value := estimated_distance (heuristic, start_node,
                                               goal, results.min_path_length);
    start_node^.f_value := (1.0 - weight) * start_node^.g_value +
                           weight * start_node^.h_value;
    insert_in_tree (start_node, search_tree);
    place_in_ascending_order (start_node, open);

    repeat
        current := remove_from_front_of_list (open);
        if (current^.state <> goal) then
        begin
            nodes_expanded := nodes_expanded + 1;
            generate_successors (current^.state, successor_list);
            while not is_empty (successor_list) do
            begin
                c := remove_from_front_of_list (successor_list);
                c^.g_value := current^.g_value + 1;
                p := find_in_tree (c, search_tree);
                if p = nil then
                begin
                    c^.parent := current;
                    c^.h_value := estimated_distance (heuristic, c,
                                      goal, results.min_path_length);
                    c^.f_value := (1.0 - weight) * c^.g_value +
                                  weight * c^.h_value;
                    add_neighbor (c, current);
                    insert_in_tree (c, search_tree);
                    place_in_ascending_order (c, open);
                end
                else
                begin
                    add_neighbor (p, current);
                    if c^.g_value < p^.g_value then
                    begin
                        p^.g_value := c^.g_value;
                        p^.f_value := (1.0 - weight) * p^.g_value +
                                      weight * p^.h_value;
                        p^.parent := current;
                        (* nil links mean p is not on open, i.e. already expanded *)
                        if ((p^.left = nil) and
                            (p^.right = nil)) then
                            update (p)
                        else
                        begin
                            delete_from_list (p, open);
                            place_in_ascending_order (p, open);
                        end;
                    end;

                    free_node (c);
                end;

                nodes_generated := nodes_generated + 1;
            end;
        end;
    until is_empty (open) or (current^.state = goal);

    if current^.state = goal then
        results.solved := true
    else
        results.solved := false;
    results.start := start_node;
    results.goal := current;
    results.heuristic := heuristic;
    results.weight := weight;
    results.path_length := current^.g_value;
    results.generated := nodes_generated;
    results.expanded := nodes_expanded;
    free_list (successor_list);
    free_node (open);
end;

end.
Heuristics Module (v4.1 23-Feb-86 AJC/SRH)


module heuristic (input, output, profile_input, profile_output);

const

(*
 *  From UTILITIES Import CONST
 *)

    max_positions = 10;
    no_puzzle = '          ';

    center = 1;
    first = 2;

(*
 *  From CONTROL Import CONST
 *)

    max_links = max_positions;
    max_levels = 99;

(*
 *  From STATISTIC Import CONST
 *)

    max_pairs = 100;

(*
 *  HEURISTIC
 *)

    max_heuristics = 24;
    max_n = 20;
    max_k = 75;
    max_name = 10;
    max_file_name = 30;


type

(*
 *  From UTILITIES Import TYPE
 *)

    positions = center..max_positions;
    puzzle_state = packed array [positions] of char;

    node_ptr = ^puzzle_node;

    neighbor_node_ptr = ^neighbor_node;
    neighbor_node = record
        neighbor : node_ptr;
        next : neighbor_node_ptr;
    end;

    puzzle_node = record
        state : puzzle_state;
        left, right, sort_left, sort_right : node_ptr;
        neighbors : neighbor_node_ptr;
        parent : node_ptr;
        g_value, h_value : integer;
        f_value : real;
    end;

(*
 *  From CONTROL Import TYPE
 *)

    link_array = array [first..max_links] of boolean;

    level_record = record
        count : integer;
        list : node_ptr;
    end;

    level_array = array [0..max_levels] of level_record;

    graph_descriptor = record
        depth, generated, expanded : integer;
        level : level_array;
    end;

(*
 *  From STATISTIC Import TYPE
 *)

    distribution_index = 1..max_pairs;

    distribution_type = (normal, linear, nonlinear);

    distribution_pointer = ^distribution_record;
    distribution_record = record
        name : distribution_type;
        pairs : distribution_index;
        abscissa, ordinate : array [distribution_index] of real;
    end;

(*
 *  HEURISTIC
 *)

    name_string = packed array [1..max_name] of char;
    file_name_string = packed array [1..max_file_name] of char;

    profile_pointer = ^profile_record;
    profile_record = record
        name : name_string;
        heuristic : 0..max_heuristics;
        min, max, count : array [0..max_n] of integer;
        mean, stdev : array [0..max_n] of real;
        histogram : array [0..max_n, 0..max_k] of integer;
    end;


var

(*
 *  From CONTROL Import VAR
 *)

    link : [external] link_array;
    num_positions : [external] integer;

(*
 *  Local
 *)

    start : puzzle_state;
    inv_search_tree, inv_graph : node_ptr;
    inv_g : graph_descriptor;
    profile_input, profile_output : text;
    frequency : array [1..max_heuristics, 0..max_n, 0..max_k]
                of integer;
    profile : array [1..max_heuristics] of profile_pointer;


(*
 *  From UTILITIES Import:
 *)

[external]
function create_puzzle_node (s : puzzle_state) : node_ptr;
external;

[external]
procedure free_binary_tree (var t : node_ptr);
external;

[external]
function find_state_in_tree (s : puzzle_state;
                             tree : node_ptr) : node_ptr;
external;

(*
 *  From CONTROL Import:
 *)

[external]
procedure initialize_graph_descriptor (var g : graph_descriptor);
external;

[external]
procedure generate_graph (start : puzzle_state;
                          var search_tree, graph : node_ptr;
                          var g : graph_descriptor);
external;

(*
 *  From STATISTIC Import:
 *)

[external]
procedure initialize_statistics;
external;

[external]
function random_integer_between (i, n : integer) : integer;
external;

[external]
function random_by_distribution (d : distribution_type) : real;
external;


(*
 *  HEURISTIC
 *)

procedure initialize_input_profiles;
var
    h : integer;
begin
    for h := 1 to max_heuristics do
        profile[h] := nil;
end;

[global]
procedure create_profile (var p : profile_pointer);
var
    i, j : integer;
begin
    new (p);
    with p^ do
    begin
        for i := 1 to max_name do
            name[i] := ' ';
        heuristic := 0;
        (* the per-level arrays are indexed from 0, so clear from 0 *)
        for i := 0 to max_n do
        begin
            min[i] := max_k;
            max[i] := 0;
            count[i] := 0;
            mean[i] := 0.0;
            stdev[i] := 0.0;
            for j := 0 to max_k do
                histogram[i, j] := 0;
        end;
    end;
end;

[global]
procedure read_profiles (file_name : file_name_string);
var
    p : profile_pointer;
    name : name_string;
    number_of_entries, frequency : integer;
    nsum, ksum, k2sum, percent : real;
    i, k, n : integer;
begin
    open (profile_input, file_name, history := old);
    reset (profile_input);
    while not eof (profile_input) do
    begin
        create_profile (p);
        with p^ do
        begin
            readln (profile_input, name, heuristic,
                    number_of_entries);
            for i := 1 to number_of_entries do
            begin
                readln (profile_input, n, k, percent, frequency);
                histogram[n, k] := frequency;
            end;

            (* per-level statistics of the heuristic estimate k *)
            for n := 0 to max_n do
            begin
                nsum := 0.0;
                ksum := 0.0;
                k2sum := 0.0;
                for k := 0 to max_k do
                    if (histogram[n, k] <> 0) then
                    begin
                        nsum := nsum + histogram[n, k];
                        ksum := ksum + k * histogram[n, k];
                        k2sum := k2sum + k * k * histogram[n, k];
                        if (k < min[n]) then
                            min[n] := k;
                        if (k > max[n]) then
                            max[n] := k;
                        count[n] := count[n] + histogram[n, k];
                    end;

                mean[n] := ksum / nsum;
                stdev[n] := sqrt (abs (ksum * ksum / nsum - k2sum)
                                  / (nsum - 1));
            end;
        end;

        profile[p^.heuristic] := p;
    end;

    close (profile_input);
end;

procedure initialize_output_profiles;
var
    h, n, k : integer;
begin
    for h := 1 to max_heuristics do
        for n := 0 to max_n do
            for k := 0 to max_k do
                frequency[h, n, k] := 0;
end;

[global]
procedure initialize_heuristics;
begin
    initialize_statistics;
    initialize_graph_descriptor (inv_g);
    inv_graph := create_puzzle_node (no_puzzle);
    inv_search_tree := inv_graph;
    initialize_input_profiles;
    initialize_output_profiles;
end;

[global]
procedure print_profiles (file_name : file_name_string);
var
    h, n, k, sum, number_of_entries : integer;
    name : name_string;

    (* right-justify the decimal digits of i in s, padded with blanks *)
    procedure integer_to_string (i : integer; var s : name_string);
    var
        j : integer;
    begin
        for j := 1 to max_name do
            s[j] := ' ';
        j := max_name;
        while (i > 0) and (j >= 1) do
        begin
            n := i mod 10;
            i := i div 10;
            s[j] := chr(n + ord('0'));
            j := j - 1;
        end;
    end;

begin
    open (profile_output, file_name, history := new);
    rewrite (profile_output);
    for h := 1 to max_heuristics do
    begin
        number_of_entries := 0;
        for n := 0 to max_n do
        begin
            for k := 0 to max_k do
                if frequency[h, n, k] <> 0 then
                    number_of_entries := number_of_entries + 1;
        end;

        if number_of_entries > 0 then
        begin
            integer_to_string (h, name);
            writeln (profile_output, name, ' ',
                     h:5, number_of_entries:10);
            for n := 0 to max_n do
            begin
                sum := 0;
                for k := 0 to max_k do
                    sum := sum + frequency[h, n, k];
                for k := 0 to max_k do
                    if frequency[h, n, k] <> 0 then
                        writeln (profile_output, n:6, k:5,
                                 ((frequency[h, n, k]*100)/sum):6:1,
                                 frequency[h, n, k]:10);
            end;
        end;
    end;

    close (profile_output);
end;


(*
 *  heuristic calculation routines
 *)

function tile_position (tile : char; goal : puzzle_state) : integer;
var
    i : integer;
begin
    i := center;
    while (goal[i] <> tile) do
        i := i + 1;
    tile_position := i;
end;

(* shortest distance around the ring between two perimeter positions *)
function perimeter_distance (tile1_pos, tile2_pos : integer) : integer;
var
    d : integer;
begin
    d := abs (tile1_pos - tile2_pos);
    if (d > (num_positions - 1 - d)) then
        d := num_positions - 1 - d;
    perimeter_distance := d;
end;

(* distance from a position to the center through the nearest link *)
function center_distance (tile_pos : integer) : integer;
var
    d, i : integer;
begin
    if tile_pos = center then
        center_distance := 0
    else
    begin
        d := num_positions;
        for i := first to num_positions do
            if link[i] and (perimeter_distance (i, tile_pos) < d) then
                d := perimeter_distance (i, tile_pos);
        center_distance := d + 1;
    end;
end;

function tiles_misplaced (current, goal : puzzle_state) : integer;
var
    i, n : integer;
begin
    n := 0;
    for i := center to num_positions do
        if (current[i] <> goal[i]) and (current[i] <> chr(0)) then
            n := n + 1;
    tiles_misplaced := n;
end;

function manhattan_distance (current, goal : puzzle_state) : integer;
var
    m_dist, i, j, d, d2 : integer;
begin
    m_dist := 0;
    for i := center to num_positions do
    begin
        j := tile_position (current[i], goal);
        if (i <> j) and (current[i] <> chr(0)) then
        begin
            if (i = center) then
                d := center_distance (j)
            else if (j = center) then
                d := center_distance (i)
            else
            begin
                (* shorter of the route through the center and around the ring *)
                d := center_distance (j) + center_distance (i);
                d2 := perimeter_distance (i, j);
                if (d2 < d) then
                    d := d2;
            end;

            m_dist := m_dist + d;
        end;
    end;

    manhattan_distance := m_dist;
end;

function enhanced_manhattan_distance (current, goal : puzzle_state) :
                                      integer;
var
    i, j, next_i, next_j, score : integer;
begin
    score := 0;
    for i := first to num_positions do
    begin
        if (current[i] <> chr(0)) then
        begin
            j := tile_position (current[i], goal);
            if (j <> center) then
            begin
                if (j = num_positions) then
                    next_j := first
                else
                    next_j := j + 1;
                if (i = num_positions) then
                    next_i := first
                else
                    next_i := i + 1;
                if (current[next_i] <> goal[next_j])
                   and (current[next_i] <> chr(0))
                   and (goal[next_i] <> chr(0)) then
                    score := score + 2;
            end;
        end;
    end;

    enhanced_manhattan_distance :=
        manhattan_distance (current, goal) + 3 * score;
end;

(* draw an estimate for level n from the recorded histogram *)
function simulated_by_histogram (n : integer;
                                 p : profile_pointer) : integer;
var
    j, k, accum : integer;
begin
    with p^ do
    begin
        k := min[n];
        accum := histogram[n, k];
        j := random_integer_between (1, count[n]);
        while (j > accum) do
        begin
            k := k + 1;
            accum := accum + histogram[n, k];
        end;
    end;

    simulated_by_histogram := k;
end;

(* draw an estimate for level n from distribution d, scaled by the profile *)
function simulated_by_distribution (n : integer;
                                    p : profile_pointer;
                                    d : distribution_type) : integer;
var
    i : integer;
    r : real;
begin
    with p^ do
    begin
        r := random_by_distribution (d);
        i := round ((stdev[n] * r) + mean[n]);
        if (i < 0) then
            i := 0;
    end;

    simulated_by_distribution := i;
end;

function worst_case_by_profile (current : node_ptr;
                                n, c_star : integer;
                                p : profile_pointer) : integer;
var
    k : integer;
begin
    with p^ do
        if ((current^.g_value + n) > c_star) then
            k := round (mean[n] - 2 * stdev[n])
        else
            k := round (mean[n] + 2 * stdev[n]);
    if k < 0 then
        k := 0;

    worst_case_by_profile := k;
end;

function proportional_error (n : integer;
                             r, l : real;
                             d : distribution_type) : integer;
var
    b : real;
begin
    b := random_by_distribution (d);
    proportional_error := round (n * (1 + l + b*(r - l)));
end;


[global]
function estimated_distance (heuristic : integer;
                             current : node_ptr;
                             goal : puzzle_state;
                             c_star : integer) : integer;
var
    n, k : integer;
    inv_current : node_ptr;
begin
    if current^.state = goal then
    begin
        (* n would otherwise be undefined in the frequency update below *)
        n := 0;
        k := 0
    end
    else
    begin
        (* rebuild the inverse graph only when the goal has changed *)
        if (inv_graph^.state <> goal) then
        begin
            free_binary_tree (inv_search_tree);
            initialize_graph_descriptor (inv_g);
            generate_graph (goal, inv_search_tree,
                            inv_graph, inv_g);
        end;

        inv_current := find_state_in_tree (current^.state,
                                           inv_search_tree);
        (* n is the true distance from the current state to the goal *)
        n := inv_current^.g_value;
        case heuristic of
            1 : k := tiles_misplaced (current^.state, goal);
            2 : k := manhattan_distance (current^.state, goal);
            3 : k := enhanced_manhattan_distance (current^.state, goal);
            4,5,6 : k := simulated_by_distribution (n, profile[heuristic-3],
                                                    normal);
            7,8,9 : k := simulated_by_histogram (n, profile[heuristic-6]);
            10,11,
            12 : k := worst_case_by_profile (current, n, c_star,
                                             profile[heuristic-9]);
            13 : k := proportional_error (n, -0.5, -1.0, linear);
            14 : k := proportional_error (n, 0.0, -0.7, linear);
            15 : k := proportional_error (n, 0.0, -0.5, linear);
            16 : k := proportional_error (n, 0.5, -0.2, linear);
            17 : k := proportional_error (n, -0.5, -1.0, nonlinear);
            18 : k := proportional_error (n, 0.0, -0.7, nonlinear);
            19 : k := proportional_error (n, 0.0, -0.5, nonlinear);
            20 : k := proportional_error (n, 0.5, -0.2, nonlinear);
        end;
    end;

    frequency[heuristic, n, k] := frequency[heuristic, n, k] + 1;
    estimated_distance := k;
end;


Statistics Module (v4.0 23-Feb-86 AJC/SRH)


module statistics (input, output, distribution_file);

const
    max_pairs = 100;
    max_name = 10;
    max_file_name = 30;
    seed = 4489;

type
    distribution_index = 1..max_pairs;
    name_string = packed array [1..max_name] of char;
    file_name_string = packed array [1..max_file_name] of char;

    distribution_type = (normal, linear, nonlinear);

    distribution_pointer = ^distribution_record;
    distribution_record = record
        name : distribution_type;
        pairs : distribution_index;
        abscissa, ordinate : array [distribution_index] of real;
    end;

var
    random_seed : integer;
    distribution : array [distribution_type] of distribution_pointer;
    distribution_file : text;

[global]
procedure initialize_statistics;
begin
    random_seed := seed;
    distribution[normal] := nil;
    distribution[linear] := nil;
    distribution[nonlinear] := nil;
end;

[external, asynchronous]
function mth$random (var seed : integer) : real;
extern;

[global]
function random_integer_between (m, n : integer) : integer;
var
    p : real;
    q : integer;
begin
    p := mth$random (random_seed);
    q := round (p * (n - m)) + m;
    if (m <= q) and (q <= n) then
        random_integer_between := q
    else
    begin
        writeln ('Warning -- random out of bounds');
        halt;
    end;
end;


(*
 *  Distribution functions
 *)

procedure create_distribution (var d : distribution_pointer);
var
    i : integer;
begin
    new (d);
    with d^ do
    begin
        pairs := 0;
        for i := 1 to max_pairs do
        begin
            abscissa[i] := 0.0;
            ordinate[i] := 0.0;
        end;
    end;
end;

[global]
procedure read_distributions (file_name : file_name_string);
var
    d : distribution_pointer;
    i : integer;
begin
    open (distribution_file, file_name, history := old);
    reset (distribution_file);
    while not eof (distribution_file) do
    begin
        create_distribution (d);
        with d^ do
        begin
            readln (distribution_file, name, pairs);
            for i := 1 to pairs do
                readln (distribution_file, abscissa[i], ordinate[i]);
        end;

        distribution[d^.name] := d;
    end;

    close (distribution_file);
end;

[global]
function random_by_distribution (d : distribution_type) : real;
var
    i : integer;
    x : real;
    done : boolean;
begin
    x := mth$random (random_seed);
    if distribution[d] <> nil then
        with distribution[d]^ do
        begin
            i := 1;
            done := false;
            (* find the first pair whose ordinate is at least x *)
            while (i <= pairs) and (not done) do
            begin
                if x <= ordinate[i] then
                    done := true
                else
                    i := i + 1;
            end;

            if done then
                random_by_distribution := abscissa[i]
            else
            begin
                writeln (' *** Error: Random not found in distribution');
                halt;


************************************************************************


APPENDIX B


Beads World Tools Definition Modules


Utilities Definition Module (v4.0 23-Feb-86 AJC/SRH)


(*
 *  From UTILITIES Import CONST
 *)

    max_positions = 10;
    no_puzzle = '          ';
    center = 1;
    first = 2;

(*
 *  From UTILITIES Import TYPE
 *)

    positions = center..max_positions;
    puzzle_state = packed array [positions] of char;

    node_ptr = ^puzzle_node;

    neighbor_node_ptr = ^neighbor_node;
    neighbor_node = record
        neighbor : node_ptr;
        next : neighbor_node_ptr;
    end;

    puzzle_node = record
        state : puzzle_state;
        left, right, sort_left, sort_right : node_ptr;
        neighbors : neighbor_node_ptr;
        parent : node_ptr;
        g_value, h_value : integer;
        f_value : real;
    end;

(*
 *  From UTILITIES Import:
 *)

[external]
procedure read_state (var s : puzzle_state; n : integer);
external;

[external]
procedure print_state (s : puzzle_state; n : integer);
external;

[external]
function create_puzzle_node (s : puzzle_state) : node_ptr;
external;

[external]
procedure free_node (var n : node_ptr);
external;

[external]
procedure create_empty_list (var list : node_ptr);
external;

[external]
function is_empty (list : node_ptr) : boolean;
external;

[external]
procedure place_on_end_of_list (p, list : node_ptr);
external;

[external]
procedure place_in_ascending_order (p, list : node_ptr);
external;

[external]
function remove_from_front_of_list (list : node_ptr) : node_ptr;
external;

[external]
procedure delete_from_list (p, list : node_ptr);
external;

[external]
procedure free_list (var list : node_ptr);
external;

[external]
procedure insert_in_tree (n : node_ptr; var tree : node_ptr);
external;

[external]
function find_in_tree (n, tree : node_ptr) : node_ptr;
external;

[external]
function find_state_in_tree (s : puzzle_state;
                             tree : node_ptr) : node_ptr;
external;



Control Definition Module (v4.0 23-Feb-86 AJC/SRH)


(*
 * From CONTROL Import CONST
 *)

   max_links  = max_positions;
   max_levels = 99;

(*
 * From CONTROL Import TYPE
 *)

   link_array = array [first..max_positions] of boolean;

   level_record = record
                     count : integer;
                     list  : node_ptr;
                  end;

   level_array = array [0..max_levels] of level_record;

   graph_descriptor = record
                         depth, generated, expanded : integer;
                         level : level_array;
                      end;

   results_descriptor = record
                           solved : boolean;
                           path_length, min_path_length, generated,
                           expanded, heuristic : integer;
                           weight : real;
                           start, goal : node_ptr;
                        end;


(*
 * From CONTROL Import VAR
 *)

   num_positions, num_links : [global] integer;
   link : [global] link_array;

(*
 * From CONTROL Import:
 *)

[external]
procedure initialize_control (np, nl : integer; l : link_array);
external;

[external]
procedure initialize_graph_descriptor (var g : graph_descriptor);
external;

[external]
procedure generate_graph (start_state : puzzle_state;
                          var search_tree, graph : node_ptr;
                          var g : graph_descriptor);
external;

[external]
procedure initialize_results (var r : results_descriptor);
external;

[external]
procedure print_puzzle_solution (r : results_descriptor);
external;

[external]
procedure ordered_search (start, goal : puzzle_state;
                          heuristic : integer;
                          weight : real;
                          var results : results_descriptor);
external;

[external]
procedure graph_search (start, goal : puzzle_state;
                        heuristic : integer;
                        weight : real;
                        var results : results_descriptor);
external;


Heuristics Definition Module (v4.0 23-Feb-86 AJC/SRH)


(*
 * From HEURISTIC Import CONST
 *)

   max_heuristics = 24;
   max_file_name  = 30;

(*
 * From HEURISTIC Import TYPE
 *)

   file_name_string = packed array [1..max_file_name] of char;

(*
 * From HEURISTIC Import:
 *)

[external]
procedure read_profiles (file_name : file_name_string);
external;

[external]
procedure print_profiles (file_name : file_name_string);
external;

[external]
procedure initialize_heuristics;
external;

[external]
function estimated_distance (heuristic : integer;
                             current : node_ptr;
                             goal : puzzle_state;
                             c_star : integer) : integer;
external;


(*

 * Statistics Definition Module (v4.0 23-Feb-86 AJC/SRH)
 *)

(*
 * From STATISTICS Import CONST
 *)

   max_file_name = 30;

(*
 * From STATISTICS Import TYPE
 *)

   file_name_string  = packed array [1..max_file_name] of char;
   distribution_type = (normal, linear, nonlinear);

(*
 * From STATISTICS Import:
 *)

[external]
procedure initialize_statistics;
external;

[external]
function random_integer_between (m, n : integer) : integer;
external;

[external]
procedure read_distributions (file_name : file_name_string);
external;

[external]
function random_by_distribution (d : distribution_type) : real;
external;


APPENDIX C


Beads World Applications Modules


************************************************************************


Application GENERATE_GRAPH (v4.0 26-Feb-86 AJC/SRH)


program generate_graph (input, output);
const

(*
 * From UTILITIES Import CONST
 *)

   max_positions = 10;
   no_puzzle     = '          ';
   center        = 1;
   first         = 2;

(*
 * From CONTROL Import CONST
 *)

   max_links  = max_positions;
   max_levels = 99;

type

(*
 * From UTILITIES Import TYPE
 *)

   positions    = center..max_positions;
   puzzle_state = packed array [positions] of char;

   node_ptr = ^puzzle_node;

   neighbor_node_ptr = ^neighbor_node;
   neighbor_node = record
                      neighbor : node_ptr;
                      next     : neighbor_node_ptr;
                   end;

   puzzle_node = record
                    state : puzzle_state;
                    left, right, sort_left, sort_right : node_ptr;
                    neighbors : neighbor_node_ptr;
                    parent : node_ptr;
                    g_value, h_value : integer;
                    f_value : real;
                 end;

(*

 * From CONTROL Import TYPE
 *)

   link_array = array [first..max_positions] of boolean;

   level_record = record
                     count : integer;
                     list  : node_ptr;
                  end;

   level_array = array [0..max_levels] of level_record;

   graph_descriptor = record
                         depth, generated, expanded : integer;
                         level : level_array;
                      end;

var

   link : link_array;
   num_positions, num_links : integer;

   start : puzzle_state;
   search_tree, graph : node_ptr;
   problem, i, num, opcode, min_sample : integer;
   gd : graph_descriptor;

(*
 * From UTILITIES Import:
 *)

[external]
procedure read_state (var s : puzzle_state; n : integer);
external;

[external]
procedure print_state (s : puzzle_state; n : integer);
external;

[external]
procedure free_binary_tree (var t : node_ptr);
external;

[external]
procedure free_graph (var g : node_ptr);
external;

(*
 * From CONTROL Import:
 *)

[external]
procedure initialize_control (np, nl : integer; l : link_array);
external;

[external]
procedure initialize_graph_descriptor (var g : graph_descriptor);
external;

[external]
procedure generate_graph (start_state : puzzle_state;
                          var search_tree, graph : node_ptr;
                          var g : graph_descriptor);
external;

(*
 * From STATISTICS Import:
 *)

[external]
function random_integer_between (m, n : integer) : integer;
external;

(*
 * GENERATE_GRAPH
 *)

(*
 * destructive print of puzzle states in graph
 *)

procedure print_graph_space (g : node_ptr);
var
   n : neighbor_node_ptr;
begin
   if g^.state <> no_puzzle then
      begin
         print_state (g^.state, num_positions);
         writeln;
         g^.state := no_puzzle;
         n := g^.neighbors;
         while n <> nil do
            begin
               print_graph_space (n^.neighbor);
               n := n^.next;
            end;
      end;
end;

procedure report_graph_statistics (problem : integer;
                                   g : graph_descriptor);
var
   avg_neighbors, tot, lev : real;
   i, j, dups, spots : integer;
begin
   writeln;
   writeln;
   writeln;
   writeln (' GRAPH STATISTICS : # ', problem:2);
   writeln;
   writeln (' Positions : ', num_positions:2);
   write (' Links ');
   for i := first to num_positions do
      if link[i] then
         write (i:3);
   writeln;
   writeln;
   write (' Starting Configuration : ');
   print_state (start, num_positions);
   writeln;
   writeln;
   writeln (' Nodes Generated ', g.generated:6);
   writeln (' Nodes Expanded ', g.expanded:5);
   avg_neighbors := (g.generated - 1) / g.expanded;
   writeln (' Avg # of Neighbors : ', avg_neighbors:6:2);
   dups := g.generated - g.expanded;
   writeln (' Number of "dupes" ', dups:5);
   writeln;
   writeln (' Depth = ', g.depth:7);
   writeln;
   writeln (' Nodes at each level: ');
   writeln;
   tot := g.expanded;
   for i := 0 to g.depth do
      begin
         write (' ', i:2, ' -- ', g.level[i].count:5, ' ');
         lev := g.level[i].count;
         spots := round (100 * (lev / tot));
         for j := 1 to spots do
            write ('*');
         writeln;
      end;
   writeln;
   writeln;
end;

(*
 * Sample generation routines
 *)

function find_ith_member (i : integer; list : node_ptr) : node_ptr;
var
   k : integer;
   p : node_ptr;
begin
   p := list^.right;
   k := 1;
   while (p <> list) and (k < i) do
      begin
         p := p^.right;
         k := k + 1;
      end;
   if p <> list then
      find_ith_member := p
   else
      find_ith_member := nil;
end;

procedure generate_sample (g : graph_descriptor);
var
   start, q : node_ptr;
   i, j, r, sample_size : integer;
   lev, tot : real;
begin
   write (num_positions:4, num_links:4);
   for i := first to num_positions do
      if link[i] then
         write (i:3);
   writeln;
   start := find_ith_member (1, g.level[0].list);
   tot := g.expanded;
   for i := 1 to g.depth do
      begin
         lev := g.level[i].count;
         sample_size := round ((lev * 100) / tot) + min_sample;
         if g.level[i].count < sample_size then
            sample_size := g.level[i].count;
         for j := 1 to sample_size do
            begin
               repeat
                  begin
                     r := random_integer_between (1, trunc(lev));
                     q := find_ith_member (r, g.level[i].list);
                  end
               until q^.g_value <> 0;
               q^.g_value := 0;
               write (i:4);
               print_state (start^.state, num_positions);
               print_state (q^.state, num_positions);
               writeln;
            end;
      end;
end;
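
The sample-size rule in `generate_sample` can be read as: each level below the start contributes its percentage share of the expanded nodes (rounded) plus a fixed minimum, capped at the number of states on that level. A minimal Python sketch of only that arithmetic follows; the function names and the example counts are hypothetical, and `pascal_round` is an assumed stand-in for Pascal's `round` on non-negative values (Python's built-in `round` rounds halves to even, which differs).

```python
def pascal_round(x):
    """Round half up; matches Pascal's round() for non-negative x."""
    return int(x + 0.5)


def sample_sizes(level_counts, expanded, min_sample):
    """Sample size for levels 1..depth, given per-level state counts.

    level_counts[0] is the start level and is skipped, as in the listing.
    """
    sizes = []
    for count in level_counts[1:]:
        size = pascal_round(count * 100 / expanded) + min_sample
        sizes.append(min(size, count))  # never ask for more states than exist
    return sizes


# Hypothetical graph: 1000 expanded nodes spread over four levels below the start.
sizes = sample_sizes([1, 4, 46, 450, 499], expanded=1000, min_sample=2)
```

The cap matters mostly for shallow levels, where the minimum sample can exceed the number of states available.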


(*
 * main program
 *)

begin
   problem := 1;
   while not eof do
      begin
         for i := first to max_positions do
            link[i] := false;
         read (num_positions, num_links);
         for i := 1 to num_links do
            begin
               read (num);
               link[num] := true;
            end;
         initialize_control (num_positions, num_links, link);
         read (opcode);
         if (opcode = 1) then
            read (min_sample);
         readln;
         read_state (start, num_positions);
         readln;
         initialize_graph_descriptor (gd);
         generate_graph (start, search_tree, graph, gd);
         if (opcode = 0) then
            report_graph_statistics (problem, gd)
         else if (opcode = 1) then
            generate_sample (gd)
         else if (opcode = 2) then
            print_graph_space (graph);
         free_graph (graph);
         problem := problem + 1;
      end;
end.


Application SOLVE (v4.0 23-Feb-86 AJC/SRH)


program solve (input, output);
const

(*
 * From UTILITIES Import CONST
 *)

   max_positions = 10;
   no_puzzle     = '          ';
   center        = 1;
   first         = 2;

(*
 * From CONTROL Import CONST
 *)

   max_links  = max_positions;
   max_levels = 99;

(*
 * From HEURISTIC Import CONST
 *)

   max_heuristics = 24;
   max_file_name  = 30;

(*
 * SOLVE
 *)

   no_file_name = '                              ';
   max_integer  = 999999999;
   max_weights  = 11;

type

(*
 * From UTILITIES Import TYPE
 *)

   positions    = center..max_positions;
   puzzle_state = packed array [positions] of char;

   node_ptr = ^puzzle_node;

   neighbor_node_ptr = ^neighbor_node;
   neighbor_node = record
                      neighbor : node_ptr;
                      next     : neighbor_node_ptr;
                   end;

   puzzle_node = record
                    state : puzzle_state;
                    left, right, sort_left, sort_right : node_ptr;
                    neighbors : neighbor_node_ptr;
                    parent : node_ptr;
                    g_value, h_value : integer;
                    f_value : real;
                 end;


(*
 * From CONTROL Import TYPE
 *)

   link_array = array [first..max_links] of boolean;

   results_descriptor = record
                           solved : boolean;
                           path_length, min_path_length, generated,
                           expanded, heuristic : integer;
                           weight : real;
                           start, goal : node_ptr;
                        end;

(*
 * From HEURISTIC Import TYPE
 *)

   file_name_string = packed array [1..max_file_name] of char;

(*
 * SOLVE
 *)

   heuristic_array = array [1..max_heuristics] of integer;
   weight_array    = array [1..max_weights] of real;

   aggregate_array = array [1..max_heuristics, 1..max_weights] of
                        integer;

   aggregate_statistics = record
                             xmin, xmax, lmin, lmax,
                             xtotal, ltotal, xmean, lmean : aggregate_array;
                          end;

   aggregate_stats_ptr = ^aggregate_statistics;
   search_method_types = (ordered, graph);


var

(*
 * SOLVE
 *)

   num_positions, num_links : integer;
   link : link_array;

   search_method : search_method_types;

   start, goal : puzzle_state;
   results : results_descriptor;

   profile_input, profile_output,
   distribution_input : file_name_string;

   i, j, k, l, num, number_of_heuristics,
   number_of_weights, opcode : integer;
   heuristic : heuristic_array;
   weight : weight_array;
   a : aggregate_stats_ptr;
   old_n, no_n, n : integer;


(*
 * From UTILITIES Import:
 *)

[external]
procedure read_state (var s : puzzle_state; n : integer);
external;

[external]
procedure print_state (s : puzzle_state; n : integer);
external;

[external]
procedure free_binary_tree (var t : node_ptr);
external;

(*
 * From CONTROL Import:
 *)

[external]
procedure initialize_control (np, nl : integer; l : link_array);
external;

[external]
procedure initialize_results (var r : results_descriptor);
external;

[external]
procedure print_puzzle_solution (r : results_descriptor);
external;

[external]
procedure ordered_search (start, goal : puzzle_state;
                          heuristic : integer;
                          weight : real;
                          var results : results_descriptor);
external;

[external]
procedure graph_search (start, goal : puzzle_state;
                        heuristic : integer;
                        weight : real;
                        var results : results_descriptor);
external;

(*
 * From HEURISTIC Import:
 *)

[external]
procedure initialize_heuristics;
external;

[external]
procedure read_profiles (file_name : file_name_string);
external;

[external]
procedure print_profiles (file_name : file_name_string);
external;

(*
 * From STATISTICS Import:
 *)

[external]
procedure read_distributions (file_name : file_name_string);
external;


(*
 * SOLVE
 *)

procedure report_puzzle_results (r : results_descriptor);
begin
   writeln;
   writeln;
   writeln (' PROBLEM SOLUTION RESULTS ', r.heuristic:4, r.weight:6:2);
   writeln;
   if not r.solved then
      writeln (' No solution found! ');
   writeln;
   write (' Start: ');
   print_state (r.start^.state, num_positions);
   writeln;
   write (' Goal: ');
   print_state (r.goal^.state, num_positions);
   writeln;
   writeln;
   writeln (' Nodes Generated ', r.generated:6);
   writeln (' Nodes Expanded ', r.expanded:6);
   writeln (' Path Length ', r.path_length:5);
   writeln (' Minimum Path Length : ', r.min_path_length:6);
   writeln;
   writeln;
end;


procedure generate_data (r : results_descriptor);
begin
   writeln (r.heuristic:6,
            r.weight:5:2,
            r.min_path_length:5,
            r.path_length:5,
            r.generated:5,
            r.expanded:6);
end;


procedure init_aggregate_results (a : aggregate_stats_ptr);
var
   k, l : integer;
begin
   for k := 1 to number_of_heuristics do
      for l := 1 to number_of_weights do
         begin
            with a^ do
               begin
                  xmin[k][l]   := max_integer;
                  xmean[k][l]  := 0;
                  xmax[k][l]   := 0;
                  lmin[k][l]   := max_integer;
                  lmean[k][l]  := 0;
                  lmax[k][l]   := 0;
                  xtotal[k][l] := 0;
                  ltotal[k][l] := 0;
               end;
         end;
end;


procedure print_aggregate_results (old_n                : integer;
                                   number_of_heuristics : integer;
                                   heuristic            : heuristic_array;
                                   number_of_weights    : integer;
                                   aggregates           : aggregate_stats_ptr);
var
   i, j : integer;
begin
   with aggregates^ do
      begin
         for i := 1 to number_of_heuristics do
            begin
               write (old_n:3);
               write (heuristic[i]:3);
               for j := 1 to number_of_weights do
                  write (xmin[i][j]:6);
               writeln;

               write (old_n:3);
               write (heuristic[i]:3);
               for j := 1 to number_of_weights do
                  write (xmean[i][j]:6);
               writeln;

               write (old_n:3);
               write (heuristic[i]:3);
               for j := 1 to number_of_weights do
                  write (xmax[i][j]:6);
               writeln;

               write (old_n:3);
               write (heuristic[i]:3);
               for j := 1 to number_of_weights do
                  write (lmin[i][j]:6);
               writeln;

               write (old_n:3);
               write (heuristic[i]:3);
               for j := 1 to number_of_weights do
                  write (lmean[i][j]:6);
               writeln;

               write (old_n:3);
               write (heuristic[i]:3);
               for j := 1 to number_of_weights do
                  write (lmax[i][j]:6);
               writeln;
            end;
      end;
end;


(*
 * problem solution routines
 *)

procedure solve (search_method : search_method_types);
begin
   for i := 1 to number_of_heuristics do
      for j := 1 to number_of_weights do
         begin
            if search_method = ordered then
               ordered_search (start, goal, heuristic[i],
                               weight[j], results)
            else
               graph_search (start, goal, heuristic[i],
                             weight[j], results);
            report_puzzle_results (results);
            free_binary_tree (results.start);
         end;
end;

procedure solve_and_print (search_method : search_method_types);
begin
   for i := 1 to number_of_heuristics do
      for j := 1 to number_of_weights do
         begin
            if search_method = ordered then
               ordered_search (start, goal, heuristic[i],
                               weight[j], results)
            else
               graph_search (start, goal, heuristic[i],
                             weight[j], results);
            report_puzzle_results (results);
            print_puzzle_solution (results);
            free_binary_tree (results.start);
         end;
end;

procedure solve_and_generate (search_method : search_method_types);
begin
   for i := 1 to number_of_heuristics do
      for j := 1 to number_of_weights do
         begin
            if search_method = ordered then
               ordered_search (start, goal, heuristic[i],
                               weight[j], results)
            else
               graph_search (start, goal, heuristic[i],
                             weight[j], results);
            generate_data (results);
            free_binary_tree (results.start);
         end;
end;

procedure solve_and_aggregate (search_method : search_method_types);
begin
   if (n <> old_n) and (old_n <> 0) then
      begin
         print_aggregate_results (old_n,
                                  number_of_heuristics,
                                  heuristic,
                                  number_of_weights,
                                  a);
         init_aggregate_results (a);
         no_n := 1;
      end;
   old_n := n;
   for i := 1 to number_of_heuristics do
      for j := 1 to number_of_weights do
         begin
            if search_method = ordered then
               ordered_search (start, goal, heuristic[i],
                               weight[j], results)
            else
               graph_search (start, goal, heuristic[i],
                             weight[j], results);
            with a^ do
               begin
                  if results.expanded < xmin[i][j] then
                     xmin[i][j] := results.expanded;
                  xtotal[i][j] := xtotal[i][j] + results.expanded;
                  xmean[i][j] := round (xtotal[i][j] / no_n);
                  if results.expanded > xmax[i][j] then
                     xmax[i][j] := results.expanded;
                  if results.path_length < lmin[i][j] then
                     lmin[i][j] := results.path_length;
                  ltotal[i][j] := ltotal[i][j] + results.path_length;
                  lmean[i][j] := round (ltotal[i][j] / no_n);
                  if results.path_length > lmax[i][j] then
                     lmax[i][j] := results.path_length;
               end;
            free_binary_tree (results.start);
         end;
end;
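
The bookkeeping in `solve_and_aggregate` keeps a running minimum, total, rounded mean, and maximum for each (heuristic, weight) cell, so no individual result has to be stored. A minimal Python sketch of one such cell follows. The class name `RunningAggregate` is hypothetical, and the sketch keeps its own observation count, whereas the listing shares a single counter (`no_n`) across all cells for the current depth.

```python
class RunningAggregate:
    """Running min / mean / max of integer observations for one cell."""

    def __init__(self):
        self.minimum = None  # the listing uses a large sentinel instead of None
        self.maximum = 0
        self.total = 0
        self.count = 0
        self.mean = 0

    def add(self, value):
        """Fold one observation (e.g. nodes expanded) into the aggregate."""
        self.count += 1
        self.total += value
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if value > self.maximum:
            self.maximum = value
        # Round half up, as Pascal's round() does for non-negative values.
        self.mean = int(self.total / self.count + 0.5)


# Hypothetical node-expansion counts from three problems at the same depth.
cell = RunningAggregate()
for expanded in (10, 20, 25):
    cell.add(expanded)
```

Recomputing the mean from the stored total on every update avoids accumulating rounding error from previously rounded means.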


procedure extract_file_name (var f : file_name_string;
                             delimiter : char);
var
   i : integer;
   ch : char;
begin
   for i := 1 to max_file_name do
      f[i] := ' ';
   ch := ' ';
   while (not eoln) and (ch <> delimiter) do
      read (ch);
   if ch = delimiter then
      begin
         read (ch);
         i := 1;
         while (not eoln) and (ch <> delimiter)
               and (i <= max_file_name) do
            begin
               f[i] := ch;
               i := i + 1;
               read (ch);
            end;
      end;
end;

(*
 * Main program
 *)

begin
   for i := first to max_positions do
      link[i] := false;
   read (num_positions, num_links);
   for i := 1 to num_links do
      begin
         read (num);
         link[num] := true;
      end;
   initialize_control (num_positions, num_links, link);


   readln (opcode, search_method);
   read (number_of_heuristics);
   for i := 1 to number_of_heuristics do
      read (heuristic[i]);
   read (number_of_weights);
   for i := 1 to number_of_weights do
      read (weight[i]);
   readln;
   extract_file_name (profile_input, '"');
   extract_file_name (profile_output, '"');
   readln;
   extract_file_name (distribution_input, '"');
   readln;
   initialize_results (results);
   if opcode < 2 then
      begin
         writeln;
         write (' Positions : ', num_positions:3);
         write (' Links ');
         for i := first to num_positions do
            if link[i] then
               write (i:3);
         writeln;
         write (' Heuristics : ', number_of_heuristics:3);
         for i := 1 to number_of_heuristics do
            write (heuristic[i]:3);
         writeln;
         write (' Weights : ', number_of_weights:3);
         for i := 1 to number_of_weights do
            write (weight[i]:6:2);
         writeln;
      end
   else
      begin
         write (num_positions:3, num_links:3);
         for i := first to num_positions do
            if link[i] then
               write (i:3);
         writeln;
         write (number_of_heuristics:3);
         for i := 1 to number_of_heuristics do
            write (heuristic[i]:3);
         write (number_of_weights:6);
         for i := 1 to number_of_weights do
            write (weight[i]:6:2);
      end;
   writeln;
   if profile_input <> no_file_name then
      read_profiles (profile_input);
   if distribution_input <> no_file_name then
      read_distributions (distribution_input);
   if opcode >= 3 then
      begin
         new (a);
         init_aggregate_results (a);
         old_n := 0;
         no_n := 0;
      end;
   while (not eof) do
      begin
         read (n);
         no_n := no_n + 1;
         results.min_path_length := n;
         read_state (goal, num_positions);
         read_state (start, num_positions);
         readln;
         case opcode of
            0 : solve (search_method);
            1 : solve_and_print (search_method);
            2 : solve_and_generate (search_method);
            3 : solve_and_aggregate (search_method);
         end;
      end;
   if opcode >= 3 then
      begin
         print_aggregate_results (old_n,
                                  number_of_heuristics,
                                  heuristic,
                                  number_of_weights,
                                  a);
         dispose (a);
      end;
end.


Application GRAFER (v4.0 6-Mar-86 AJC/SRH)


      PROGRAM GRAFER
      integer crvparms (12,3), heur (16), doagain
      real wt (11)
      dimension datafl (25,60,6,11), pkedlns (200)
*                      (k, n, type, w)
      character*6 types(6)
      data (types (i), i=1,6) /'XMin','XMean','XMax','LMin','LMean',
     +                         'LMax'/

      call readdata (datafl, maxn, maxheur, heur, wt, maxwt)
 23   call menu (maxheur, heur, maxwt, wt, types, ncurves, crvparms,
     +           maxn, option)
      call plottype ()
      call setaxis (ncurves, crvparms, types,
     +              pkedlns, wt, maxwt, maxn, option)
      call drawcrvs (heur, wt, maxwt, types, ncurves, crvparms,
     +               datafl, pkedlns, maxn, option)
      call height (.10)
      if (option .eq. 1) then
         call legend (pkedlns, ncurves + 2, .2, 2.35)
      elseif ((option .eq. 2) .or. (option .eq. 4)) then
         call legend (pkedlns, ncurves + 1, .2, 2.26)
      else
         call legend (pkedlns, ncurves, .2, 2.6)
      endif
      call endpl (0)
      if (doagain () .eq. 1) then
         goto 23
      endif
      call donepl
      stop
      end


*************************************************************************
      subroutine menu (maxheur, heur, maxwt, wt, types, ncurves,
     +                 crvparms, maxn, option)
*************************************************************************
      integer heur (16), crvparms (12, 3)
      real wt (11)
      character*6 types (6)
      integer gettype, getheur, getwt, doagain

 1198 ncurves = 0
      print *, 'Do you want'
      print *, '   1: X vrs N graph'
      print *, '   2: L vrs N graph'
      print *, '   3: X vrs W graph'
      print *, '   4: L vrs W graph'
      read *, option
      if ((option .lt. 1) .or. (option .gt. 4)) then
         print *, 'invalid response'
         go to 1198
      endif

      if (option .eq. 3) then
         i = getheur (maxheur, heur)
         do 3001 j = 1, 6
            ncurves = ncurves + 1
            crvparms (ncurves, 1)
            crvparms (ncurves, 2)
            crvparms (ncurves, 3)
 3001    continue
      elseif (option .eq. 4) then
         i = getheur (maxheur, heur)
         do 3002 j = 1, 6
            ncurves = ncurves + 1
            crvparms (ncurves, 1)
            crvparms (ncurves, 2)
            crvparms (ncurves, 3) = j * 4
 3002    continue
      else

 1199    print *, 'Do you want:'
         print *, '   1: All Actual K, one weight, one type'
         print *, '   2: All types, one weight, one K'
         print *, '   3: Selected W, one K, one type'
         print *, '   4: All W, one K, one type'
         print *, '   5: Actual K vrs Simulated Ks,',
     +            ' one wt, one type'
         print *, '   6: Other variation graph'
         read *, ians
         if ((ians .lt. 1) .or. (ians .gt. 6)) then
            print *, ' Invalid response'
            go to 1199
         endif

         if (ians .eq. 1) then
            j = gettype (types, option)
            i = getwt (maxwt, wt)
            do 185 k = 1, 3
               crvparms (k, 1) = j
               crvparms (k, 2) = k
               crvparms (k, 3) = i
 185        continue
            ncurves = 3
         elseif (ians .eq. 2) then
            i = getheur (maxheur, heur)
            j = getwt (maxwt, wt)
            i1 = option * 3
            k2 = 3
            do 186 k1 = 1, 3
               crvparms (k2, 1) = i1 - k1 + 1
               crvparms (k2, 2) = i
               crvparms (k2, 3) = j
               k2 = k2 - 1
 186        continue
            ncurves = 3
         elseif (ians .eq. 3) then
            i = getheur (maxheur, heur)
            j = gettype (types, option)
            do 1871 k = 1, 4
               crvparms (k, 1) = j
               crvparms (k, 2) = i
 1871       continue
            crvparms (1, 3) = 1
            crvparms (2, 3) = 2
            crvparms (3, 3) = 4
            crvparms (4, 3) = 7
            ncurves = 4
         elseif (ians .eq. 4) then
            i = getheur (maxheur, heur)
            j = gettype (types, option)
            do 1872 k = 1, maxwt
               crvparms (k, 1) = j
               crvparms (k, 2) = i
               crvparms (k, 3) = k
 1872       continue
            ncurves = maxwt
         elseif (ians .eq. 5) then
            i = getheur (maxheur, heur)
            j = gettype (types, option)
            l = getwt (maxwt, wt)
            do 188 k = 1, maxheur / 3
               crvparms (k, 1) = j
               crvparms (k, 2) = i + 3*(k-1)
               crvparms (k, 3) = l
 188        continue
            ncurves = maxheur / 3
         else
            again = 1
 300        if (again .eq. 1) then
               ncurves = ncurves + 1
               crvparms (ncurves, 1) = gettype (types, option)
               crvparms (ncurves, 2) = getheur (maxheur, heur)
               crvparms (ncurves, 3) = getwt (maxwt, wt)
               again = doagain ()
               go to 300
            endif
         endif
      endif
      return
      end


************************************************************************
      subroutine setaxis (ncurves, crvparms, types, pkedlns, wt,
     +                    maxwt, maxn, option)
************************************************************************
      integer crvparms (12, 3), itext (4)
      dimension pkedlns (200)
      real wt (11)
      character*6 types (6)
      character*16 title
      dimension datafl(25,60,6,11), xarray(25), yarray(25), ybrfrst(20)
*                      (k, n, type, w)
      data (ybrfrst (i), i=1, 20) /1, 5, 14, 33, 44, 66, 113, 173,
     +     363, 377, 603, 033, 1365, 1661, 3014, 3364, 3343,
     +     3403, 3614, 3610/

      xdimen = 4.0
      ydimen = 3.0
      call reset ('all')
      call triplx
      call height (.17)
      call area2d (xdimen, ydimen)
      call sclpic (.8)
      call dot
      if (option .eq. 1) then
         call graf (0.0, 30.0, 30.0, 1.0, 1.0, ydimen)
         call height (0.08)
         call xgraxs (0.0, 0.1, 1.0, xdimen, ' ', -100, 0.0, 0.0)
         call xticks (6)
         call height (.17)
         call xintax
         call xgraxs (0.0, 6.0, float(maxn), xdimen,
     +                'Depth of Goal (N)$', 100, 0.0, 0.0)
         call yaxang (0.0)
         call ylgaxs (1.0, 0.76, ydimen,
     +                'Nodes Expanded (X)$', 100, 0.0, 0.0)
         call height (.10)
         call lines ('optimal$', pkedlns, 1)
         call lines ('breadth first$', pkedlns, 2)
         ientry = 2
         do 50 i=1, maxn
            xarray (i) = float (i)
            yarray (i) = float (i)
 50      continue


      call reset ('dot')
***** NOW LABEL THE LINES *****
      do 100 i=1, ncurves
         j1 = crvparms (i, 1)
         j2 = crvparms (i, 3)
         title = types (j1)
         j3 = index (title, ' ')
         if ((option .eq. 1) .or. (option .eq. 2)) then
            write (title (j3:), fmt=111) crvparms (i, 2), wt (j2)
 111        format (',K', I2, ',', F4.2, '$')
            read (title, '(4A4)') itext
            call lines (itext, pkedlns, i + ientry)
         else
            write (title (j3:), fmt=112) crvparms (i, 2), j2
 112        format (',K', I2, ',N=', I2, '$')
            read (title, '(4A4)') itext
            call lines (itext, pkedlns, i + ientry)
         endif
 100  continue
      return
      end


*************************************************************************
      subroutine drawcrvs (heur, wt, maxwt, types, ncurves,
     +                     crvparms, datafl, pkedlns, maxn, option)
*************************************************************************
      integer heur (15), crvparms (12,3)
      real wt (11), datafl(25,60,6,11), xarray (25), yarray (25)
      character*6 types (6)
      dimension pkedlns (200)

***** loop for each entry in crvparms
      markit = 14
      if ((option .eq. 1) .or. (option .eq. 2)) then
         do 301 j=1, ncurves
            type = crvparms (j, 1)
            h = crvparms (j, 2)
            w = crvparms (j, 3)
            do 300 i=1, maxn
               xarray (i) = float (i)
               if (option .eq. 1) then
                  yarray(i) = datafl (h, i, type, w)
               elseif (option .eq. 2) then
                  yarray(i) = datafl (h, i, type, w)
     +                        / float (i)
               endif
 300        continue
            call marker (markit)
            call curve (xarray, yarray, maxn, 1)
            markit = markit - 1
 301     continue
      else
         do 330 j=1, ncurves
            type = crvparms (j, 1)
            h = crvparms (j, 2)
            n = crvparms (j, 3)
            do 331 i=1, maxwt
               xarray(i) = wt(i)
               if (option .eq. 3) then
                  yarray(i) = datafl(h, n, type, i)
               else
                  yarray(i) = datafl(h, n, type, i) / n
               endif
 331        continue
            call marker (markit)
            call curve (xarray, yarray, maxwt, 1)
            markit = markit - 1
 330     continue
      endif
      return
      end

************************************************************************
      subroutine readdata (datafl, maxn, maxheur, heur, wt, maxwt)
************************************************************************
      dimension datafl (25,60,6,11)
      real wt (11)
      integer heur (16)

****** load data file array with aggregate values ***************
      open (1, file='search.out', status='old', err=105)

***** scan past first values (get weights and num of wts)
      read (1,*,end=105) nodes, nlinks, (links, i=1,nlinks)
      read (1,*,end=105) maxheur, (heur (i), i=1,maxheur),
     +                   maxwt, (wt (i), i=1, maxwt)
 100  read (1,*,end=110)
     +     (n, k, (datafl (k,n,ln,i), i=1, maxwt), ln=1, 6)
      go to 100
 110  close (1)
      maxn = n
      go to 199

***** abnormal file condition
 105  print *, 'error with input file'
      stop

 199  end

************************************************************************
      subroutine plottype ()
************************************************************************

      print *, 'Do you want plot   1: On Terminal'
      print *, '                   2: Printed'
      read *, ians
      if (ians .eq. 2) then
         call qms
         print *, 'remember to type CLASER when done'
      else
         print *, 'SET TERMINAL UP FOR TEKTRONICS MODE NOW'
         print *, ' Type 1 when ready '
         read *, answer
         call tekall (4010, 480, 0, 0, 0)
      endif
      return
      end


      integer function gettype (types, option)
************************************************************************
      character*5 types (8)

      if (option .eq. 1) then
         min = 1
         max = 3
      else
         min = 4
         max = 6
      endif

 1120 print *,'specify a plot type:'
      write (*,fmt=1121) (i, types (i), i=min,max)
 1121 format (10x,I1,': ',A5,' vrs N')
      read *, answer
      if ((answer .lt. min) .or. (answer .gt. max)) then
         print *,'Invalid option'
         go to 1120
      endif

      gettype = answer
      return
      end


      integer function getheur (maxheur, heur)
************************************************************************
      integer heur (15)

 1125 print *,'select a heuristic'
      write (*, fmt=1126) (i, heur (i), i=1, maxheur)
 1126 format (10X, I2, ': K', I2)
      read *, answer
      if ((answer .lt. 1) .or. (answer .gt. maxheur)) then
         print *,'Invalid option'
         go to 1125
      endif

      getheur = heur (answer)
      return
      end


Application  GRAFPROF  (t4.0  fl-Mar-86  AJC/SRH) 


      PROGRAM GRAFPROF

      integer crvparms (12,2), heur (25)
      real wt (11)
      dimension pkedlns (200)
      dimension profile (25,50,4)
      dimension histo (25,50,75), percents (25,50,75)
      character*5 types (3)

      data (types (i), i=1,3) /'KMin', 'KMean', 'KMax'/

      call readprof (profile, percents, histo, heur,
     +               maxheur, maxn, typeprof)

   11 call menu (maxheur, heur, types, ncurves, crvparms,
     +           profile, maxn, option)

      call plottype ()

      if (option .eq. 3) then
         iheur = crvparms (1,2)
         call draw3d (profile, percents, histo,
     +                iheur, maxn, typeprof)
      else
         call setaxis (maxn, typeprof)
         call drawcrvs (heur, types, ncurves, crvparms,
     +                  pkedlns, maxn, profile, option)
         if (option .eq. 4) then
            call endpl (0)
            iheur = crvparms (1,2)
            call draw3d (profile, percents, histo,
     +                   iheur, maxn, typeprof)
         endif
      endif
      call endpl (0)

      print *, 'Do another?   0: No'
      print *, '              1: Yes'
      read *, ians
      if (ians .eq. 1) then
         go to 11
      else
         call donepl
      endif
      end

************************************************************************

      subroutine menu (maxheur, heur, types, ncurves,
     +                 crvparms, profile, maxn, option)
************************************************************************
      dimension profile (25, 50, 4)
      integer heur (25), crvparms (12, 2)
      character*5 types (3)
      integer gettype, getheur, getwt, doagain

 1100 print *, ' Do you want:'
      print *, '   1: All K, one type'
      print *, '   2: One K, all types, incl standard deviation'
      print *, '   3: 3-D Profile for one K'
      print *, '   4: Full set for one K (options 2 and 3)'
      print *, '   5: Other variation graph'
      read *, option
      if ((option .lt. 1) .or. (option .gt. 5)) then
         print *, ' Invalid response'
         go to 1100
      endif

      if (option .eq. 1) then
         j = gettype (types)
         do 185 k=1, maxheur
            crvparms (k, 1) = j
            crvparms (k, 2) = heur (k)
  185    continue
         ncurves = maxheur
      elseif ((option .eq. 2) .or. (option .eq. 4)) then
         i = getheur (maxheur, heur)
         do 186 k = 1, 3
            crvparms (k, 1) = k
            crvparms (k, 2) = i
  186    continue
         ncurves = 3
      elseif (option .eq. 3) then
         crvparms (1,2) = getheur (maxheur, heur)
      else
*        start a fresh curve list, then add curves until told to stop
         ncurves = 0
         again = 1
  200    if (again .eq. 1) then
            ncurves = ncurves + 1
            crvparms (ncurves, 1) = gettype (types)
            crvparms (ncurves, 2) = getheur (maxheur, heur)
            again = doagain ()
            go to 200
         endif
      endif

      return
      end

************************************************************************ 
      subroutine readprof (profile, percents, histo,
     +                     heur, maxheur, maxn, typeprof)
      dimension profile (25,50,4)
      integer heur (25)
      dimension histo (25,50,75), percents (25,50,75)
      real ksum, k2sum, nsum
      character*15 char

      maxheur = 25
      maxn = 50
      maxk = 75
      do 21 i=1, maxheur
         heur (i) = 0
         do 21 j=1, maxn
            do 21 k=1, maxk
               histo (i,j,k) = 0.0
               percents (i,j,k) = 0.0
   21 continue

   22 print *, 'Do you want   1: Source profiles'
      print *, '              2: Run profiles'
      read *, typeprof
      if ((typeprof .lt. 1) .or. (typeprof .gt. 2)) then
         print *, 'Invalid option'
         go to 22
      endif

      if (typeprof .eq. 1) then
         open (2, file='profile.pro', status='old', err=105)
      else
         open (2, file='profile.run', status='old', err=105)
      endif

   99 read (2,*,end=111) char, heuristic, entries
      maxn = 0
      do 102 i=1, entries
         read (2,*,end=111) n, k, percent, count
         histo (heuristic, n+1, k+1) = count
         percents (heuristic, n+1, k+1) = percent / 100.0
         heur (heuristic) = heuristic
         if (n .gt. maxn) then
            maxn = n
         endif
  102 continue
      go to 99

***** abnormal file condition
  105 print *, 'error with input file'
      stop

  111 close (2)
      j = 0
      do 57 i=1, maxheur
         if (heur (i) .gt. 0) then
            j = i
         endif
   57 continue
      maxheur = j

      do 56 i=1, maxheur
         do 56 j=1, maxn + 1
            profile (i,j,1) = 9999.0
            profile (i,j,2) = 0.0
            profile (i,j,3) = 0.0
            profile (i,j,4) = 0.0
            nsum = 0
            ksum = 0
            k2sum = 0
            do 55 k=1, maxk
               if (histo (i,j,k) .ne. 0) then
                  nsum = nsum + histo (i,j,k)
                  ksum = ksum + (k-1) * histo (i,j,k)
                  k2sum = k2sum + (k-1) * (k-1) * histo (i,j,k)
                  if ((k-1) .lt. profile (i,j,1)) then
                     profile (i,j,1) = k - 1
                  endif
                  if ((k-1) .gt. profile (i,j,3)) then
                     profile (i,j,3) = k - 1
                  endif
               endif
   55       continue
            if (nsum .gt. 0) then
               profile (i,j,2) = ksum / nsum
            else
               profile (i,j,1) = 0
            endif
            if (nsum .gt. 1) then
               profile (i,j,4) =
     +            sqrt(abs(ksum*ksum/nsum - k2sum) / (nsum-1))
            endif
   56 continue
      return
      end


      subroutine plottype ()

      print *,'Do you want plot   1: On Terminal'
      print *,'                   2: Printed'
      read *, ians
      if (ians .eq. 2) then
         call qms
         print *,'remember to (laser filename) when done'
      else
         print *,'SET TERMINAL UP FOR TEKTRONICS MODE NOW'
         print *,' Enter 1 when ready:'
         read *, ians
         call tekall (4010, 480, 0,0,0)
      endif
      return
      end

************************************************************************
      subroutine setaxis (maxn, typeprof)
************************************************************************
      call reset ('all')
      call triplx
      call height (.17)
      xdimen = 4.0
      ydimen = 3.0
      call area2d (xdimen, ydimen)
      call head2 (typeprof)
      call sclpic (.8)
      call dot
      call marker (13)
      call yname ('Est Dist (K)$', 100)
      call yticks (5)
      call yaxang (0.0)
      call yintax
      call graf (0.0, 5.0, float(maxn), 0.0, 5.0, 50.0)
      call height (0.08)
      call xgraxs (0.0, 0.1, 1.0, xdimen, ' $', -100, 0.0, 0.0)
      call xticks (5)
      call height (0.17)
      call xintax
      call xgraxs (0.0, 5.0, float(maxn), xdimen,
     +             'True Dist (i)$', 100, 0.0, 0.0)
      return
      end
      subroutine drawcrvs (heur, types, ncurves,
     +                     crvparms, pkedlns, maxn, profile, option)

      integer heur (25), crvparms (12,2), itext (4)
      real xarray (50), yarray (50)
      character*5 types (3)
      character*16 title
      dimension pkedlns (200), profile (25, 50, 4)

      call height (.10)
      call lines ('optimal$', pkedlns, 1)
      do 52 i=1, maxn + 1
         xarray (i) = float (i - 1)
         yarray (i) = float (i - 1)
   52 continue
      call curve (xarray, yarray, maxn + 1, 1)
      call reset ('dot')

      do 100 i=1, ncurves
         j1 = crvparms (i, 1)
  100 continue

** for each entry in crvparms
      markit = 15
      do 301 j=1, ncurves
** set up name for legend
         type = crvparms (j, 1)
         title = types (type)
         j3 = index (title, ' ')
         write (title (j3:), fmt=112) crvparms (j, 2)
  112    format ('K', I2, '$')
         read (title, '(4A4)') itext
         call lines (itext, pkedlns, j + 1)
         heuristic = crvparms (j, 2)
** get data for the curve and draw it
         do 300 i=1, maxn + 1
            xarray (i) = float (i - 1)
            yarray (i) = profile (heuristic, i, type)
  300    continue
         call marker (markit)
         call curve (xarray, yarray, maxn + 1, 1)
         markit = markit + 1
  301 continue

      if ((option .eq. 2) .or. (option .eq. 4)) then
         call lines ('+/- 1 Std. Dev$', pkedlns, ncurves + 2)
         do 302 i=1, maxn + 1
            yarray (i) = profile (heuristic, i, 2)
     +                 + profile (heuristic, i, 4)
  302    continue
         call dot
         call marker (markit)
         call curve (xarray, yarray, maxn + 1, 1)
         do 303 i=1, maxn + 1
            yarray (i) = profile (heuristic, i, 2)
     +                 - profile (heuristic, i, 4)
  303    continue
         call marker (markit)
         call curve (xarray, yarray, maxn + 1, 1)
      endif

      call height (.10)
      if ((option .eq. 2) .or. (option .eq. 4)) then
         call legend (pkedlns, ncurves + 2, .2, 2.25)
      else
         call legend (pkedlns, ncurves + 1, .2, 2.25)
      endif
      return
      end

************************************************************************


      integer function gettype (types)
************************************************************************
      character*5 types (3)

 1120 print *,'specify a plot type:'
      write (*,fmt=1121) (i, types (i), i=1,3)
 1121 format (10x,I1,': ',A5,' vrs N')
      read *, answer
      if ((answer .lt. 1) .or. (answer .gt. 3)) then
         print *,'Invalid option'
         go to 1120
      endif

      gettype = answer
      return
      end

************************************************************************

      integer function getheur (maxheur, heur)
      integer heur (25)

 1125 print *,'select a heuristic'
      do 1126 i=1, 15
         if (heur (i) .ne. 0) then
            write (*, fmt=1127) i, heur (i)
         endif
 1126 continue
 1127 format (10X, I2, ': K', I2)
      read *, answer
      if ((answer .lt. 1)
     +    .or. (answer .gt. maxheur)
     +    .or. (heur (answer) .eq. 0)) then
         print *,'Invalid option'
         go to 1125
      endif

      getheur = answer
      return
      end


      integer function doagain ()

 1140 print *,'do you want to put another curve on this plot?'
      print *,'   0: No'
      print *,'   1: Yes'
      read *, answer
      if ((answer .lt. 0) .or. (answer .gt. 1)) then
         print *,'Invalid option'
         go to 1140
      endif

      doagain = answer
      return
      end


      subroutine draw3d (profile, percents, histo, iheur,
     +                   maxn, typeprof)

      integer heur (25)
      dimension profile (25,50,4)
      dimension histo (25,50,75), percents (25,50,75)
      dimension zarray (21,51)
      dimension xray (21), yray (51), zray (51)

      call reset ('all')
      call triplx
      call height (.17)
      call area2d (4.0, 3.5)
      call head1 (iheur, typeprof)
      call sclpic (.8)
      call marker (13)
      call height (0.10)
      call x3name ('True Dist (i)$', 100)
      call y3name ('Est Dist (K)$', 100)
      call z3name ('Freq (%)$', 100)
*** set the perspective point for the graph ***
      print *, 'input X perspective:'
      read *, xx
      print *, 'input Y perspective:'
      read *, yy
      print *, 'input Z perspective:'
      read *, zz
      call vuabs (xx, yy, zz)
      call volm3d (4.0, 4.0, 1.0)
      call yticks (5)
      call xticks (5)
      call zaxang (-90.0)
      call xintax
      call yintax
      call graf3d (0.0, 5.0, float(maxn),
     +             0.0, 5.0, 50.0,
     +             0.0, 0.25, 1.0)
      do 11 n=1, maxn + 1
         do 11 k=1, 51
            zarray (n,k) = percents (iheur, n, k)
   11 continue
      call surmat (zarray, 1, maxn + 1, 1, 51, 0)
      call grfiti (0.0,4.0,0.0, 4.0,4.0,0.0, 0.0,4.0,1.0)
      call area2d (4.0, 1.0)
      call yaxang (0.0)
      call graf (0.0, 5.0, float(maxn), 0.0, 0.25, 1.0)
      do 111 n=1, maxn + 1
         xray (n) = n - 1
         yray (n) = 0.0
         do 111 k=1, 51
            if (percents (iheur, n, k) .gt. yray (n)) then
               yray (n) = percents (iheur, n, k)
            endif
  111 continue
      call curve (xray, yray, maxn + 1, 1)
      call dot
      call grid (1,1)
      call reset ('dot')
      call height (0.17)
      return
      end


************************************************************************ 


      subroutine head1 (iheur, typeprof)
************************************************************************
      integer itext (6)
      character*24 title

      if (typeprof .eq. 1) then
         title = 'Source Profile, K'
         write (title (18:), fmt=112) iheur
      else
         title = 'Run Profile, K'
         write (title (15:), fmt=112) iheur
      endif
  112 format (I2, '$')
      read (title, '(6A4)') itext
      call headin (itext, 100, 1.5, 1)
      return
      end

************************************************************************

      subroutine head2 (typeprof)
************************************************************************
      if (typeprof .eq. 1) then
         call headin ('Source Profile$', 100, 1.5, 1)
      else
         call headin ('Run Profile$', 100, 1.5, 1)
      endif
      return
      end
***********************************************************
*                                                         *
*                       APPENDIX D                        *
*                                                         *
*               Distribution Disk Contents                *
*                                                         *
***********************************************************


General Beads World Utilities Modules:

    UTILITIES.PAS    --  Utilities module
    CONTROL.PAS      --  Control structures module
    HEURBAS.PAS      --  Basic heuristics module
    STATISTIC.PAS    --  Statistics module

Beads World Utilities Definition Files:

    UTILITIES.DEF    --  Utilities definitions
    CONTROL.DEF      --  Control structure definitions
    HEURISTIC.DEF    --  Heuristic definitions
    STATISTIC.DEF    --  Statistic definitions

Applications Modules:

    GRAPH.PAS        --  Beads World graph generator
    SOLVE.PAS        --  A* puzzle solver
    HEURMOD.PAS      --  Additional heuristics
    GRAFER.FOR       --  Graphics package (solutions)
    GRAFPROF.FOR     --  Graphics package (profiles)


***********************************************************
*                                                         *
*                       APPENDIX E                        *
*                                                         *
*             KEY TO GRAPH SYMBOLS AND TERMS              *
*                                                         *
***********************************************************

Symbol  Meaning

MIN-    Minimum of all the values observed in the sample.
MAX-    Maximum of all the values observed in the sample.
MEAN-   The average of all the values in the sample.
N-      Depth or level in the state space.
I-      Actual minimal distance of a node from the goal.
X-      Number of nodes expanded.
L-      Normalized solution path length.
W-      The weight used in the Weighted A* algorithm.
K-      The heuristic's estimate of I.
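
The MIN, MEAN and MAX entries above can be illustrated with a short sketch. It is written in Python rather than the Fortran of the listings, and it only mirrors the kind of summary that GRAFPROF's readprof routine builds from a histogram of K values: minimum, mean, maximum and sample standard deviation. The function name `profile_stats` and the sample histogram are hypothetical, introduced here for illustration only.

```python
import math

def profile_stats(histogram):
    """Summarise a histogram {K value: count} as (MIN, MEAN, MAX, std dev).

    Returns None when the histogram holds no observations.
    """
    # Keep only the K values that were actually observed.
    observed = {k: c for k, c in histogram.items() if c > 0}
    n = sum(observed.values())
    if n == 0:
        return None
    k_sum = sum(k * c for k, c in observed.items())
    k2_sum = sum(k * k * c for k, c in observed.items())
    mean = k_sum / n
    # The sample standard deviation needs at least two observations.
    if n > 1:
        std = math.sqrt(abs(k2_sum - k_sum * k_sum / n) / (n - 1))
    else:
        std = 0.0
    return min(observed), mean, max(observed), std

# Hypothetical sample: at one true distance I, the heuristic returned
# K = 3 twice, K = 4 five times and K = 6 once.
sample = {3: 2, 4: 5, 6: 1}
k_min, k_mean, k_max, k_std = profile_stats(sample)
print(k_min, k_mean, k_max, round(k_std, 3))
```

Working from running sums of K and K squared, as the sketch does, lets all four statistics come out of a single pass over the histogram.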